SAS_User13
Calcite | Level 5

Hello all

 

I hope this message finds you well. Apologies for the long question.

 

I have never analyzed a longitudinal dataset before. I have been going through materials from classes I took years ago (and materials online), but I am still a bit confused and would like some help to make sure I am approaching this problem correctly. Any help would be greatly appreciated.

 

I have a large dataset that tracks size change over time, with size calculated from images. Below I am including fictitious data for 5 patients that reflect the structure of the overall dataset.

 

ID  sex  Size  race  image_occasion  time  timesq
1   0    20    2     1               0     0
1   0    23    2     2               13    169
1   0    12    2     3               15    225
1   0    22    2     4               18    324
1   0    25    2     5               29    841
1   0    24    2     6               104   10816
1   0    25    2     7               112   12544
1   0    28    2     8               117   13689
1   0    33    2     9               118   13924
2   1    20    1     1               0     0
2   1    26    1     2               3     9
2   1    26    1     3               9     81
2   1    28    1     4               15    225
2   1    33    1     5               21    441
2   1    29    1     6               27    729
2   1    31    1     7               37    1369
2   1    35    1     8               43    1849
2   1    27    1     9               57    3249
3   1    20    1     1               0     0
3   1    15    1     2               29    841
3   1    12    1     3               62    3844
4   0    22    2     1               0     0
4   0    23    2     2               1     1
4   0    40    2     3               4     16
4   0    18    2     4               6     36
5   1    17    2     2               0     0
5   1    23    2     4               35    1225

 

 

The meanings of the variables (except the self-explanatory ones) are:

  • image_occasion: the sequence number of each image taken on the same individual. Size was measured on every image.
  • time: the time of each image (and thus each size measurement) from baseline, in months. Time=0 is the baseline (first) image.
  • timesq: simply time*time
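For anyone who wants to reproduce the structure, the fictitious rows above can be read into SAS with a DATA step. This is only a sketch: the dataset name `mydata` is taken from the code later in the post, and `timesq` is computed rather than typed in, following its definition as time*time.

```sas
/* Sketch: load the 5 fictitious patients and derive timesq = time*time. */
data mydata;
   input id sex size race image_occasion time;
   timesq = time*time;   /* squared time, in months^2 */
   datalines;
1 0 20 2 1 0
1 0 23 2 2 13
1 0 12 2 3 15
1 0 22 2 4 18
1 0 25 2 5 29
1 0 24 2 6 104
1 0 25 2 7 112
1 0 28 2 8 117
1 0 33 2 9 118
2 1 20 1 1 0
2 1 26 1 2 3
2 1 26 1 3 9
2 1 28 1 4 15
2 1 33 1 5 21
2 1 29 1 6 27
2 1 31 1 7 37
2 1 35 1 8 43
2 1 27 1 9 57
3 1 20 1 1 0
3 1 15 1 2 29
3 1 12 1 3 62
4 0 22 2 1 0
4 0 23 2 2 1
4 0 40 2 3 4
4 0 18 2 4 6
5 1 17 2 2 0
5 1 23 2 4 35
;
run;
```

With only 5 patients, models with several variance parameters may not converge; the rows are meant to show the layout, not to give meaningful estimates.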

 

I would like to model the change in size over time with repeated measurements, adjusted for other baseline variables and then plot a graph to show this.

 

This dataset is clearly unbalanced: each patient was measured at different times from baseline, and each patient has a different number of images/measurements.

 

For this reason, my understanding is that the best approach is to use a linear mixed-effects model with random effects, fitted with PROC MIXED.

 

I have a few questions, if you can help me:

 

QUESTION 1: Should this be a “RANDOM intercept” or a “RANDOM intercept time” model? Thus, should I have only random intercepts, or random intercepts and slopes?

 

Should it be:

 

proc mixed data=mydata;
   class id image_occasion sex race;
   model size = time / s chisq;                /* s prints the fixed-effect estimates */
   random intercept / type=un subject=id;      /* subject= must name the patient ID variable */
run;

 

or

 

proc mixed data=mydata;
   class id image_occasion sex race;
   model size = time / s chisq;                 /* s prints the fixed-effect estimates */
   random intercept time / type=un subject=id;  /* type=un lets intercept and slope covary */
run;

 

I think I should use “RANDOM intercept.” With this model I am assuming that even though each patient starts at a different “intercept” (a different baseline size), their growth over time is roughly similar. Is this correct?

Of course, when I add other variables to the model (for example sex and race) and create a multivariable model, the interpretation becomes a bit more complex, but broadly speaking that is the meaning, right?

In this case, I can have a summary result for the population (fixed effects). Is this right?

 

If I instead build the “RANDOM intercept time” model, then I am assuming that each individual also has a different slope over time. In that case, would it be more difficult to obtain a summary result for the population (fixed effects)? Is this right?
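Writing the two candidate models out may help frame the question. The notation below is an assumption introduced here, not taken from the post: $i$ indexes patients, $j$ indexes images, $t_{ij}$ is time in months, and $b_{0i}$, $b_{1i}$ are patient-specific deviations.

```latex
% Random intercept only: patients differ in level, share one slope
\text{size}_{ij} = (\beta_0 + b_{0i}) + \beta_1 t_{ij} + \varepsilon_{ij},
\qquad b_{0i} \sim N(0, \sigma_{b}^2)

% Random intercept and slope: patients differ in level and in growth rate
\text{size}_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + \varepsilon_{ij},
\qquad (b_{0i}, b_{1i})^\top \sim N(\mathbf{0}, \mathbf{G})

% In both models the random effects have mean zero, so
E[\text{size}_{ij}] = \beta_0 + \beta_1 t_{ij}
```

Under this formulation the fixed effects $\beta_0$ and $\beta_1$ describe the population-average trajectory in both models, so a population summary remains available when random slopes are included; what changes is how the between-patient variability is modelled.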

 

I am not asking about the covariance structure for now. I was planning to choose between the different options based on the AIC value once I have settled on the correct model for the mean from above.
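One way the AIC comparison could be set up is sketched below, assuming the `mydata` dataset and the `id`, `size` and `time` variables from the post. The names `fit_ri`, `fit_rs` and `fit_compare` are hypothetical. Both fits use REML and the same fixed effects, which is the usual condition for comparing covariance structures with information criteria.

```sas
/* Sketch: same fixed effects, two random-effects structures, REML fits. */
ods output FitStatistics=fit_ri;
proc mixed data=mydata method=reml;
   class id;
   model size = time / s;
   random intercept / subject=id;
run;

ods output FitStatistics=fit_rs;
proc mixed data=mydata method=reml;
   class id;
   model size = time / s;
   random intercept time / type=un subject=id;
run;

/* Stack the fit statistics (AIC, AICC, BIC, -2 Res Log Likelihood) side by side. */
data fit_compare;
   length structure $24;
   set fit_ri(in=a) fit_rs;
   structure = ifc(a, 'random intercept', 'random intercept+slope');
run;

proc print data=fit_compare noobs;
run;
```

A smaller AIC favours that structure; with very few patients the random-slope fit may report a non-positive-definite G matrix, in which case the comparison is not informative.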

 

QUESTION 2: How can I plot the results above, namely the change of size over time from the Proc Mixed regression?

 

First, I would like to have only the mean change of size over time for the whole dataset (crude and multivariable). Then, I will do subgroup analysis in which I stratify for example by sex or other variables.

 

I am using the option “outpm=output_results;” and then “proc sgplot”, but I am not sure about the validity of the results (it gives me a very straight line, which I am not sure reflects the data).

In addition, when I fit a multivariable model, this method does not work: it seems to give results for each individual patient, or at least many different lines, and I do not understand exactly what they represent.
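A minimal sketch of the OUTPM= route, assuming the names from the post (`mydata`, `id`, `size`, `time`); `pop_pred` is a hypothetical output dataset name. OUTPM= writes the fixed-effects (population-level) prediction for every observation into the variable `Pred`.

```sas
/* Sketch: population-level predictions from a model with time as the only fixed effect. */
proc mixed data=mydata;
   class id;
   model size = time / s outpm=pop_pred;
   random intercept / subject=id;
run;

/* Sort so the SERIES statement connects points in time order. */
proc sort data=pop_pred;
   by time;
run;

proc sgplot data=pop_pred;
   scatter x=time y=size / transparency=0.5;        /* observed sizes */
   series  x=time y=pred / lineattrs=(thickness=2); /* fitted population mean */
   xaxis label="Months from baseline";
   yaxis label="Size";
run;
```

When time enters the model only linearly, the population prediction is intercept + slope*time, so a straight line is the expected result. Once covariates such as sex or race are added, `Pred` differs for each covariate pattern, which would explain the many lines; plotting with `group=` or fixing the covariates at chosen values are two ways to get one curve per subgroup.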

 

I have also tried the option of doing:

“store output_results2;”

at the end of the Proc Mixed command, and then use the following:

proc plm restore=output_results2;
   effectplot fit(x=time);
run;

 

This seems to work better, but still I am not convinced it is the right approach.
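For the multivariable and subgroup plots, the STORE / PROC PLM route could look like the sketch below. The covariates `sex` and `race` and the item-store name `output_results2` come from the post; the choice of SLICEFIT and the CLM option is an assumption about what the desired graph is.

```sas
/* Sketch: multivariable model saved to an item store. */
proc mixed data=mydata;
   class id sex race;
   model size = time sex race / s;
   random intercept / subject=id;
   store output_results2;
run;

proc plm restore=output_results2;
   /* One fitted line per level of sex, with confidence limits for the mean. */
   effectplot slicefit(x=time sliceby=sex) / clm;
run;
```

EFFECTPLOT draws fixed-effects predictions only. Covariates that are not on an axis or in SLICEBY= are held at fixed values (the documented defaults are the mean for continuous variables and the reference level for CLASS variables; the AT option changes this), so the plot shows one curve per slice rather than one per patient.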

 

Can you please help me determine which would be the best approach to use in this case? I have spent two weeks trying to figure this out.

 

 

 

QUESTION 3: What if the “timesq” variable is significant (p<0.05) when I add it to the model? That would suggest that size change over time is not linear but quadratic (unless my model above is misspecified), right? In that case, does anything change in the coding for the two questions above? In particular, how would I plot a graph that shows this?
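If the quadratic term is kept, one option is to write it as `time*time` in the MODEL statement instead of using the separate `timesq` variable. This is a sketch under the same naming assumptions as the post; `quad_fit` is a hypothetical item-store name.

```sas
/* Sketch: quadratic growth specified as time*time so PLM knows both terms move together. */
proc mixed data=mydata;
   class id;
   model size = time time*time / s;
   random intercept / subject=id;
   store quad_fit;
run;

proc plm restore=quad_fit;
   effectplot fit(x=time) / clm;   /* curved population-average trajectory */
run;
```

The reason for the change: with `timesq` as a stand-alone covariate, the plotting step treats it as unrelated to `time` and holds it at a fixed value, which produces a straight line even though the fitted model is quadratic.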

 

Below is an image that I took from another paper that looked at a similar outcome. They do not mention that growth was quadratic in time. Rather, they simply say that they used a linear mixed model and that this graph comes from “line plot of overall estimated marginal mean of maximum diameter across time.”

 

I am not sure how they produced this image, but my guess is that my data should look something like this. So far I have not been able to produce this image, or anything similar to it.
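One possible reading of “estimated marginal mean across time” is that time was treated as a categorical factor and the plot shows least-squares means per occasion. That is a guess about the paper's method, not something it states. A sketch under that assumption, using `image_occasion` from the post as the categorical time variable and a hypothetical output dataset `lsm`:

```sas
/* Sketch: occasion as a CLASS effect; LSMEANS gives one estimated mean per occasion. */
ods output LSMeans=lsm;
proc mixed data=mydata;
   class id image_occasion;
   model size = image_occasion / s;
   random intercept / subject=id;
   lsmeans image_occasion / cl;
run;

proc sgplot data=lsm;
   series  x=image_occasion y=estimate / markers;
   scatter x=image_occasion y=estimate /
           yerrorlower=lower yerrorupper=upper;   /* 95% confidence limits */
   xaxis label="Image occasion";
   yaxis label="Estimated mean size";
run;
```

Because the images here were taken at different months for each patient, the occasion number is only a rough stand-in for time; binning `time` into intervals would be another way to build the categorical factor.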

 

 

[Image: line plot of overall estimated marginal mean of maximum diameter across time, taken from another paper]

 

 

 

 

Any input you might have would be enormously appreciated!

 

Thank you very much

 

2 REPLIES
Rick_SAS
SAS Super FREQ

You might want to look at the article, "Visualize a mixed model that has repeated measures or random coefficients" and the follow-up article "Longitudinal data: The mixed model."

 

I realize this doesn't answer all your questions, but it might help you with some of the background you need to get started. 

SAS_User13
Calcite | Level 5
Thank you very much. I had already seen the first one, but I will read it again more carefully, and I will look up the second one as well. Thank you.
