opiczak
Fluorite | Level 6

Hello,

Is there a proper way to fit a mixed model when the number of measurements per subject is uneven and (more importantly) the time intervals between measurements are uneven, with measurements taken at different time points? The dataset contains observations spanning several years.

 

I have data from artificial insemination stations on the ejaculate quality of boars. The goal is to determine whether a particular mutation (SNP) affects ejaculate quality. Sample data:

 

boar_id   breed   year   snp   age(days)   y(quality)   measurement_order
1         A       2020   AA    250         330          1
1         A       2020   AA    265         290          2
2         B       2016   AA    330         400          1
2         B       2016   AA    350         385          2
2         B       2017   AA    365         360          3

 

The biggest problem I see is that measurement 1 for boar 1 falls on a completely different date than measurement 1 for boar 2.

 

I wanted to try something like this:

 

 

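proc mixed data=have;
    /* age and interval stay out of CLASS so they are treated as continuous */
    class boar_id breed year snp measurement_order;
    model y = age interval breed year snp measurement_order;
    /* SP(POW) requires a continuous coordinate list, here age in days */
    repeated measurement_order / subject=boar_id(snp) type=sp(pow)(age);
run;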

 

 

where interval would be the number of days since the previous measurement. But I am not sure whether this fixes the problem above.
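
A minimal sketch of how such an interval variable could be derived, assuming the dataset is called have, the age in days is stored in a numeric variable named age, and the output dataset name have_interval is purely illustrative:

```sas
/* Sketch only: dataset and variable names are assumptions
   based on the sample data above. */
proc sort data=have;
    by boar_id age;
run;

data have_interval;
    set have;
    by boar_id;
    /* DIF is called on every row so that its internal queue stays in sync */
    interval = dif(age);
    /* the first record of each boar has no previous measurement */
    if first.boar_id then interval = .;
run;
```

Note that rows with a missing interval are excluded when interval is used as a covariate, so each boar's first measurement would drop out of the model unless a different convention (for example, 0) is chosen.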

 

I also considered giving every unique observation date its own "serial number" (so instead of measurement_order I would use a time point from 1 to n), but then I end up with thousands of levels for the fixed effect of time...

 

So is there a solution?  

 

Accepted Solutions
SteveDenham
Jade | Level 19

Here is a link to an example in PROC GLIMMIX that may be useful: https://documentation.sas.com/doc/en/statug/15.2/statug_glimmix_examples09.htm . This example analyzes body weight in cows measured at 23 unequally spaced time points. In your case, the difficult part will be enumerating all of the observed time points and inserting missing values. Then you will have to hope that you can get the model to converge. This is the equivalent of your "serial number" analysis.

 

An alternative would be a generalized estimating equations (GEE) approach. See PROC GEE Example 49.3, Weighted GEE for Longitudinal Data That Have Missing Values: https://documentation.sas.com/doc/en/statug/15.2/statug_gee_examples03.htm . This very clever method uses the MISSMODEL statement to model the probability that a measurement is missing. Because your PROC MIXED model is set up to estimate marginal effects, the GEE approach to a marginal model may be more tractable.
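
A minimal sketch of what a marginal GEE model for these data might look like, assuming the column names from the sample data (with age and y as numeric variables) and an exchangeable working correlation chosen purely for illustration. It does not reproduce the weighted analysis from the linked example, which additionally needs a MISSMODEL statement:

```sas
/* Sketch only: variable names follow the sample data in the question;
   the working correlation structure is an illustrative assumption. */
proc gee data=have;
    class boar_id breed year snp;
    model y = age breed year snp / dist=normal link=identity;
    /* each boar is one cluster; clusters may differ in size */
    repeated subject=boar_id / type=exch;
run;
```

Because GEE inference rests on empirical (sandwich) standard errors, the test for snp does not depend on the working correlation being exactly right, provided the missingness is completely at random.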

 

SteveDenham

 


opiczak
Fluorite | Level 6

Thank you again, Steve. The example is great and I will definitely look into it.
