Solved: Re: proc mixed, 'type=ar(1)' or 'type=SP(POW)(time)'?

GiaLee · Posted 02-22-2024 08:34 AM

Hi,

I have a few questions about using a linear mixed model for repeated measurements.

Each subject has different follow-up times and number of measurements. For example:

subject A: day 1, day 8, day 20

subject B: day 7, day 13

subject C: day 3, day 19, day 27

...

I then categorized time into 6-day intervals (0-6, 6-12, 12-18...) and labeled it as "time_category", and used it in the mixed model.

Given that the correlations are expected to be highest between adjacent times and lower between more distant times, could I say that the samples are equally spaced (0-6, 6-12..) and use 'type=ar(1)' ?

Or should I consider their original, unequally spaced times before the categorization, and use 'type=SP(POW)(time)'?

1. If type=ar(1) is better, should I add "time_category" after the REPEATED?

2. If type=SP(POW)(time) is better, does 'type=sp(pow)' only go with continuous variables? If it does, since I've categorized time into categories, should I turn it into something continuous like 1, 2, 3, 4,...?

3. How could I draw a figure to see the differences in trajectory over time between different sites? (predicted value versus time)

Here is my code:

PROC MIXED DATA = test METHOD = REML COVTEST ;
CLASS site record_id time_category(ref="0-6");
MODEL Score =time_category site time_category*site/ SOLUTION;
RANDOM INTERCEPT / SUBJECT = record_id;
repeated time_category/ type=ar(1) SUBJECT = record_id;
RUN;

Any suggestions are appreciated, thanks!

jiltao · Posted 02-22-2024 09:27 AM

TYPE=AR(1) does not account for unequal time spacings. TYPE=SP(POW)(time) does.
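For reference, the two structures differ only in what the exponent on the correlation parameter measures. As a sketch of PROC MIXED's parameterization, for two observations $i$ and $j$ on the same subject, with $t_i$ and $t_j$ denoting their actual measurement times (notation introduced here for illustration):

```latex
\begin{aligned}
\text{AR(1):}\quad & \operatorname{Cov}(e_i, e_j) = \sigma^2 \rho^{\,|i-j|}
  && \text{($i$, $j$ are positions in the ordered levels of the REPEATED effect)} \\
\text{SP(POW):}\quad & \operatorname{Cov}(e_i, e_j) = \sigma^2 \rho^{\,d_{ij}},
  \quad d_{ij} = |t_i - t_j|
  && \text{($d_{ij}$ is the actual time between the two measurements)}
\end{aligned}
```

If every pair of adjacent measurements were exactly $\Delta$ days apart, then $d_{ij} = \Delta\,|i-j|$ and SP(POW) would reduce to AR(1) with $\rho_{\text{AR}} = \rho^{\Delta}$. With unequal gaps the two structures generally differ.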

You might do both and compare the fit statistics to see which model fits your data better.

When using TYPE=SP(POW)(time), you might want to use the original time values --

repeated/ type=sp(pow)(time) SUBJECT = record_id;
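A minimal sketch of that comparison, reusing the data set and variable names from the question (test, Score, site, record_id, time_category) and assuming the original day values are stored in a variable named time. The output data set names fit_ar1, fit_sppow, and fit_compare are placeholders. Both fits use REML with identical fixed effects, so their information criteria can be compared directly.

```sas
/* Fit 1: AR(1) on the categorized time; save the fit statistics table */
ods output FitStatistics=fit_ar1;
proc mixed data=test method=reml covtest;
  class site record_id time_category(ref="0-6");
  model Score = time_category site time_category*site / solution;
  random intercept / subject=record_id;
  repeated time_category / type=ar(1) subject=record_id;
run;

/* Fit 2: spatial power on the original day values (assumed variable: time) */
ods output FitStatistics=fit_sppow;
proc mixed data=test method=reml covtest;
  class site record_id time_category(ref="0-6");
  model Score = time_category site time_category*site / solution;
  random intercept / subject=record_id;
  repeated / type=sp(pow)(time) subject=record_id;
run;

/* Stack the two tables and label each row with its covariance structure */
data fit_compare;
  length model $8;
  set fit_ar1(in=from_ar1) fit_sppow;
  model = ifc(from_ar1, "AR(1)", "SP(POW)");
run;

proc print data=fit_compare noobs;
run;
```

Smaller AIC, AICC, and BIC values indicate the better-fitting structure. Only the REPEATED statement differs between the two fits.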

For the plot, you might add the OUTP=PREDDATA option to the MODEL statement in PROC MIXED. Then use PROC SGPLOT later --

proc sort data=preddata; by site time; run;

proc sgplot data=preddata;

series y=pred x=time / group=site;

run;
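Put together, the plotting steps might look like the sketch below. It reuses names from the question and assumes the original day variable is called time; predm is a placeholder data set name. Note one deliberate substitution: it uses OUTPM= rather than OUTP=. OUTP= predictions include each subject's random intercept, so a series grouped only by site jumps between subjects, whereas OUTPM= writes the fixed-effects-only prediction, which is the site-level trajectory asked about.

```sas
/* Fit the model and write marginal (fixed-effects-only) predictions */
proc mixed data=test method=reml;
  class site record_id time_category(ref="0-6");
  model Score = time_category site time_category*site / solution outpm=predm;
  random intercept / subject=record_id;
  repeated / type=sp(pow)(time) subject=record_id;
run;

/* SERIES connects points in data order, so sort within site by time */
proc sort data=predm;
  by site time;
run;

/* One line per site; Pred is the predicted-value variable in the output */
proc sgplot data=predm;
  series y=pred x=time / group=site;
  xaxis label="Day";
  yaxis label="Predicted Score";
run;
```

Because time enters the model as a category, each site's line is flat within a 6-day interval and steps at the interval boundaries.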

Hope this helps,

Jill

GiaLee · Posted 02-22-2024 10:08 AM

Thanks!
I tried both, and they showed nearly the same BIC and AICC. Does this mean that the intervals are possibly equally spaced, making either approach suitable?

PROC MIXED DATA = test METHOD = REML COVTEST ;
CLASS record_id site time_dichotomous(ref="0-6");
MODEL Score = time_dichotomous site time_dichotomous*site/ SOLUTION;
RANDOM INTERCEPT / SUBJECT = record_id;
repeated time_dichotomous / type=ar(1) sub=record_id;
RUN;

PROC MIXED DATA = test METHOD = REML COVTEST ;
CLASS record_id time_dichotomous(ref="0-6") site;
MODEL Score = time_dichotomous site time_dichotomous*site/ SOLUTION;
RANDOM INTERCEPT / SUBJECT = record_id;
repeated/ type=SP(POW)(TIME_original);
RUN;

jiltao · Posted 02-22-2024 11:43 AM

Your second PROC MIXED program is missing the SUBJECT=record_id option in the REPEATED statement.

GiaLee · Posted 02-22-2024 12:31 PM

Thanks for pointing out the error. I added this:
repeated time_dichotomous/ type=SP(POW)(TIME_JL) sub=record_id;
Their AIC and BIC are still nearly the same. So I believe both of them should work.

jiltao · Posted 02-22-2024 12:40 PM

Yes, it seems that either model works for your data.

GiaLee · Posted 02-29-2024 06:05 PM

Hi, may I ask you another question regarding the "type=SP(POW)(TIME_JL)"?

I would like to test time as a continuous variable:

PROC MIXED DATA =test METHOD = REML COVTEST ;
CLASS record_id age_less55;
MODEL Score = time_JL age_less55 time_JL*age_less55/ SOLUTION OUTpred=PREDDATA;
RANDOM INTERCEPT / SUBJECT = record_id;
repeated/ type=SP(POW)(TIME_JL) sub=record_id;
RUN;

I'm receiving a warning message:

WARNING: The R matrix depends on observation order within subjects. Omitting observations from the analysis because of missing values can affect this matrix. Consider using a classification effect in the REPEATED statement to determine ordering in the R matrix.

It seems to require a classification effect after the REPEATED statement, but my time variable is continuous. Do you have any suggestions for handling this situation? Thank you!

jiltao · Posted 03-01-2024 09:13 AM

You may ignore the warning message in this case.
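The warning is issued whenever the REPEATED statement has no effect. With SP(POW), the covariance between two observations is computed from their TIME_JL values rather than from their row positions, which is presumably why it is harmless here. If a well-defined order within each subject is still wanted, a simple precaution (an optional step, not something the procedure requires) is to sort before fitting:

```sas
/* Optional: give each subject's rows a fixed, time-ordered sequence */
proc sort data=test;
  by record_id time_JL;
run;
```

The warning still appears after sorting, because it is triggered by the missing REPEATED effect and not by the row order.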

Thanks,

Jill

GiaLee · Posted 03-01-2024 09:14 AM

Thank you!