Re: Subject in repeated statement for PROC MIXED

stats2554 · Posted 01-31-2023 11:53 PM

I’m currently running a parallel trial with 4 groups and 5 different subjects per group, measured at 8 equally spaced timepoints, 1 to 8 hours post treatment. The first level is the control group, which is compared to each of the 3 treatment groups. I have subject=subject in the repeated line, but I’ve recently seen someone run this analysis using subject=subject*time. Can anyone explain how adding the time interaction would affect the outcome?

class subject treatment time;
model result= treatment time treatment*time;
Repeated time / type=CS subject=??;
lsmeans treatment / adjust=dunnett;

StatsMan · Posted 02-01-2023 08:16 AM

With a repeated statement like

repeated time / type=cs subject=subject;

you are modeling the R matrix as a block diagonal structure with a common covariance on all the observations from the same level of SUBJECT.

With your data description, if SUBJECT takes on a unique value for each unique subject in your data (i.e., subjects 1,2,3 in block a and subjects 4,5,6 in block b), then this syntax will be fine. If you have subjects nested in blocks (i.e., subjects 1,2,3 in block a and subjects 1,2,3 in block b) and subject 1 in block a is a different subject than subject 1 in block b, then you need subject=subject(block) on your repeated statement.
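To check which case applies, one option is to count how many groups each subject ID appears in. This is only a sketch: the data set name mydata is hypothetical, and treatment stands in for the block variable in the example above.

```sas
/* List any subject ID that appears under more than one treatment. */
/* An empty result suggests IDs are unique across groups, so       */
/* subject=subject is enough.                                      */
proc sql;
  select subject, count(distinct treatment) as n_groups
  from mydata
  group by subject
  having count(distinct treatment) > 1;
quit;
```

If rows come back and those IDs really are different people in each group, the nested form would be repeated time / type=cs subject=subject(treatment);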

A statement like

repeated time / type=cs subject=subject*time;

is likely incorrect. The subject= effect on the repeated statement is the experimental unit on which repeated measures were taken. If time is your repeated measure, then it is unlikely that you took repeated measures over time on the experimental unit subject*time.

stats2554 · Posted 02-01-2023 11:49 AM

Would it be appropriate to include group=treatment in this case? Thanks!

StatsMan · Posted 02-01-2023 12:45 PM

GROUP=TREATMENT will estimate the TYPE=CS structure separately for each level of treatment. If you suspect you have heterogeneity in the variances across levels of treatment, then that is the way to handle it. You do need sufficient data to estimate the variances for each treatment. If you are having trouble using GROUP= then that can be a sign that you do not have sufficient data to estimate the variance separately.
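A sketch of what that could look like with the variable names from your post (mydata is a hypothetical data set name). With only 5 subjects per group the separate estimates may be poorly determined, so treat this as something to try rather than a recommendation.

```sas
proc mixed data=mydata;
  class subject treatment time;
  model result = treatment time treatment*time;
  /* GROUP=treatment: separate CS parameters for each treatment level */
  repeated time / type=cs subject=subject group=treatment;
  lsmeans treatment / adjust=dunnett;
run;
```

Comparing the fit statistics (for example AIC) from this run with those from the same model without GROUP= is one way to judge whether the extra parameters are worth it.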

stats2554 · Posted 02-01-2023 01:03 PM

Thanks! My last question: when I remove the subject= option completely but leave in the repeated time, the results are very similar. Can you explain why the results are similar when it is removed, and how the analysis differs this way? It actually has better sensitivity to pick up changes, but I wouldn’t use it if it was wrong.

StatsMan · Posted 02-01-2023 01:23 PM

If you have a repeated statement like

repeated time / type=cs;

then you are not estimating anything in the R matrix except a diagonal element. MIXED might partition the variance into two components (with TYPE=CS), but the components will sum to the residual variance you obtain when you drop the repeated statement completely. So this repeated statement is the same as having no repeated statement at all. Specifically, this repeated statement is not correlating any observations in your data.
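One way to see this for yourself is to run both versions and compare the covariance parameter estimates. A sketch, assuming a hypothetical data set mydata with the variables from your post; cp_none and cp_nosubj are just illustrative output names.

```sas
/* (a) No REPEATED statement: a single residual variance. */
proc mixed data=mydata;
  class subject treatment time;
  model result = treatment time treatment*time;
  ods output CovParms=cp_none;
run;

/* (b) REPEATED without SUBJECT=: nothing is being correlated. */
proc mixed data=mydata;
  class subject treatment time;
  model result = treatment time treatment*time;
  repeated time / type=cs;
  ods output CovParms=cp_nosubj;
run;

/* Per the description above, the estimates in (b) should add up */
/* to the single residual variance in (a).                       */
proc print data=cp_none; run;
proc print data=cp_nosubj; run;
```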

jiltao · Posted 02-01-2023 02:44 PM

I used the data and program below for your questions:

data pr;
input Person Gender $ y1 y2 y3 y4;
y=y1; Age=8; output;
y=y2; Age=10; output;
y=y3; Age=12; output;
y=y4; Age=14; output;
drop y1-y4;
datalines;
1 F 21.0 20.0 21.5 23.0
2 F 21.0 21.5 24.0 25.5
3 F 20.5 24.0 24.5 26.0
4 F 23.5 24.5 25.0 26.5
5 F 21.5 23.0 22.5 23.5
6 F 20.0 21.0 21.0 22.5
7 F 21.5 22.5 23.0 25.0
8 F 23.0 23.0 23.5 24.0
9 F 20.0 21.0 22.0 21.5
10 F 16.5 19.0 19.0 19.5
11 F 24.5 25.0 28.0 28.0
12 M 26.0 25.0 29.0 31.0
13 M 21.5 22.5 23.0 26.5
14 M 23.0 22.5 24.0 27.5
15 M 25.5 27.5 26.5 27.0
16 M 20.0 23.5 22.5 26.0
17 M 24.5 25.5 27.0 28.5
18 M 22.0 22.0 24.5 26.5
19 M 24.0 21.5 24.5 25.5
20 M 23.0 20.5 31.0 26.0
21 M 27.5 28.0 31.0 31.5
22 M 23.0 23.0 23.5 25.0
23 M 21.5 23.5 24.0 28.0
24 M 17.0 24.5 26.0 29.5
25 M 22.5 25.5 25.5 26.0
26 M 23.0 24.5 26.0 30.0
27 M 22.0 21.5 23.5 25.0
;

proc mixed data=pr ;
class Person Gender age;
model y = Gender Age Gender*Age / s;
repeated age / type=cs subject=Person r;
run;

proc mixed data=pr ;
class Person Gender age;
model y = Gender Age Gender*Age / s;
repeated age / type=cs subject=Person*age r;
run;

You can see that the two programs above produce the same results. Generally speaking, the second program has a REPEATED statement that is more complex than necessary. It seems to be okay for this model, but if you use a different covariance structure, such as AR(1) or UN, you might see something like the following:

NOTE: Convergence criteria met but final Hessian is not positive definite.

So I would use the first model specification.

stats2554 · Posted 02-02-2023 03:43 PM

Thanks! If the subject= option is not included and the data are not treated as correlated, is this still considered a “repeated measures” analysis? The results give p-values for the treatment*time interaction that indicate the correct trend is being detected, but I wasn’t sure if there was real utility in doing such an analysis (picking up the same trends, with less power?). Thanks!

jiltao · Posted 02-02-2023 03:49 PM

I am not sure about your question -- no subject= option or no repeated statement? What is the model now? And are your data repeated measures?

Thanks,

Jill

stats2554 · Posted 02-02-2023 04:41 PM

It is repeated measures data at four timepoints, 1-4 hours. Without a subject= option, my data give the correct significance for treatment and the treatment*time interaction, except that the p-values are lower than when a subject= option is entered. So I’m curious how it is treating the data at the four timepoints without a defined subject, especially since I’m getting results similar to those with a subject= option in the repeated line.

jiltao · Posted 02-02-2023 07:24 PM

Are you comparing the models with --

repeated age / type=cs ;

and

repeated age / type=cs subject=Person;

Or something else?

stats2554 · Posted 02-02-2023 08:15 PM

Correct, both of those models with proc mixed (just using a different data set). I’m curious how SAS treats the data differently in the first vs. the second model you mention. I ask because it still shows a valid significant treatment*time interaction, as it should, and the results are not far off when I add Dunnett’s test to compare lsmeans between treatments. So it seems to still be making valid inferences. I’m just not sure how it is treating the data at repeated timepoints without a subject= option in the repeated statement.

jiltao · Posted 02-03-2023 10:05 AM

repeated age / type=cs;

just estimates the residual variance and breaks it into two pieces. So your R matrix has a constant variance (the sum of the two estimates) on the diagonal and zero covariance.

repeated age / type=cs subject=person;

estimates a block-diagonal covariance matrix R, with one block per person. Within each block, the structure is compound symmetry.

You can add the R option in the REPEATED statement to see the R matrix in the output.

If you want to model the correlations in your data, you might need to use the second specification (with the SUBJECT= option). And go with whatever this model produces for your fixed effects.
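To put the fixed-effect tests from the two specifications side by side, something like the following could be used with the pr data from my earlier post. It is only a sketch; the output data set names and the spec variable are made up for illustration.

```sas
/* With SUBJECT=: observations on the same person are correlated. */
proc mixed data=pr;
  class Person Gender Age;
  model y = Gender Age Gender*Age;
  repeated Age / type=cs subject=Person r rcorr;
  ods output Tests3=tests_subj;
run;

/* Without SUBJECT=: no observations are correlated. */
proc mixed data=pr;
  class Person Gender Age;
  model y = Gender Age Gender*Age;
  repeated Age / type=cs;
  ods output Tests3=tests_nosubj;
run;

/* Stack the two Type 3 tables and label where each row came from. */
data tests_both;
  length spec $ 12;
  set tests_subj(in=from_subj) tests_nosubj;
  if from_subj then spec = 'SUBJECT=';
  else spec = 'no SUBJECT=';
run;

proc print data=tests_both;
  var spec Effect NumDF DenDF FValue ProbF;
run;
```

Any difference in the F tests and p-values between the two sets of rows comes from whether the within-person correlation is modeled.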
