stats2554
Calcite | Level 5
I’m currently running a parallel trial with 4 groups and 5 different subjects per group, measured at 8 equally spaced timepoints from 1 to 8 hours post-treatment. The first level is the control group, which is compared to each of the 3 treatment groups. I have subject=subject in the REPEATED statement, but I recently saw someone run this analysis using subject=subject*time. Can anyone explain how adding the time interaction would affect the outcome?

class subject treatment time;
model result= treatment time treatment*time;
Repeated time / type=CS subject=??;
lsmeans treatment / adjust=dunnett;
StatsMan
SAS Super FREQ

With a repeated statement like

repeated time / type=cs subject=subject;

you are modeling the R matrix as a block diagonal structure with a common covariance on all the observations from the same level of SUBJECT. 

Given your data description, if SUBJECT takes a unique value for each unique subject in your data (i.e., subjects 1, 2, 3 in block a and subjects 4, 5, 6 in block b), then this syntax is fine. If you have subjects nested in blocks (i.e., subjects 1, 2, 3 in block a and subjects 1, 2, 3 in block b), and subject 1 in block a is a different subject than subject 1 in block b, then you need subject=subject(block) on your REPEATED statement.

A statement like

repeated time / type=cs subject=subject*time;

is likely incorrect. The SUBJECT= effect on the REPEATED statement is the experimental unit on which the repeated measures were taken. If time is your repeated measure, then it is unlikely that you took repeated measures over time on the experimental unit subject*time.
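
For the design described in the question, the full call might look like the sketch below. The data set name trial is hypothetical; the variable names come from the original post. It assumes subject IDs are reused across treatment groups, which is why it uses subject=subject(treatment). If every subject already has a unique ID, subject=subject defines the same blocks.

```sas
/* Sketch only: 'trial' is a hypothetical data set name,
   with one row per subject per timepoint */
proc mixed data=trial;
  class subject treatment time;
  model result = treatment time treatment*time;
  /* Each subject within a treatment group forms one block of R */
  repeated time / type=cs subject=subject(treatment);
  /* Dunnett comparisons against the control level
     (by default, the first level of treatment) */
  lsmeans treatment / adjust=dunnett;
run;
```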

stats2554
Calcite | Level 5
Would it be appropriate to include group=treatment in this case? Thanks!
StatsMan
SAS Super FREQ

GROUP=TREATMENT will estimate the TYPE=CS structure separately for each level of treatment. If you suspect you have heterogeneity in the variances across levels of treatment, then that is the way to handle it. You do need sufficient data to estimate the variances for each treatment. If you are having trouble using GROUP= then that can be a sign that you do not have sufficient data to estimate the variance separately. 
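
As a rough sketch of that option, again assuming a hypothetical data set named trial and the variable names from the original post, GROUP= goes on the REPEATED statement alongside SUBJECT=:

```sas
/* Sketch only: 'trial' is a hypothetical data set name */
proc mixed data=trial;
  class subject treatment time;
  model result = treatment time treatment*time;
  /* Separate CS parameters are estimated for each level of treatment */
  repeated time / type=cs subject=subject(treatment) group=treatment;
  lsmeans treatment / adjust=dunnett;
run;
```

With 4 treatment groups and 2 CS parameters per group, this fits 8 covariance parameters from 5 subjects per group, so the sufficient-data caveat above applies.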

stats2554
Calcite | Level 5
Thanks! My last question: when I remove the subject= option completely but leave in repeated time, the results are very similar. Can you explain why the results are similar when it is removed, and how the analysis differs? It actually has better sensitivity to pick up changes, but I wouldn’t use it if it were wrong.
StatsMan
SAS Super FREQ

If you have a repeated statement like

repeated time / type=cs;

then you are not estimating anything in the R matrix except a diagonal element. MIXED might partition the variance into two components (with TYPE=CS), but the components will sum to the residual variance you obtain when you drop the REPEATED statement completely. So this REPEATED statement is the same as having no REPEATED statement at all. Specifically, it does not correlate any observations in your data.

jiltao
SAS Super FREQ

I used the data and program below for your questions:

 

/* Reshape one row per person (y1-y4) into one row per person and age */
data pr;
input Person Gender $ y1 y2 y3 y4;
y=y1; Age=8; output;
y=y2; Age=10; output;
y=y3; Age=12; output;
y=y4; Age=14; output;
drop y1-y4;
datalines;
1 F 21.0 20.0 21.5 23.0
2 F 21.0 21.5 24.0 25.5
3 F 20.5 24.0 24.5 26.0
4 F 23.5 24.5 25.0 26.5
5 F 21.5 23.0 22.5 23.5
6 F 20.0 21.0 21.0 22.5
7 F 21.5 22.5 23.0 25.0
8 F 23.0 23.0 23.5 24.0
9 F 20.0 21.0 22.0 21.5
10 F 16.5 19.0 19.0 19.5
11 F 24.5 25.0 28.0 28.0
12 M 26.0 25.0 29.0 31.0
13 M 21.5 22.5 23.0 26.5
14 M 23.0 22.5 24.0 27.5
15 M 25.5 27.5 26.5 27.0
16 M 20.0 23.5 22.5 26.0
17 M 24.5 25.5 27.0 28.5
18 M 22.0 22.0 24.5 26.5
19 M 24.0 21.5 24.5 25.5
20 M 23.0 20.5 31.0 26.0
21 M 27.5 28.0 31.0 31.5
22 M 23.0 23.0 23.5 25.0
23 M 21.5 23.5 24.0 28.0
24 M 17.0 24.5 26.0 29.5
25 M 22.5 25.5 25.5 26.0
26 M 23.0 24.5 26.0 30.0
27 M 22.0 21.5 23.5 25.0
;

proc mixed data=pr ;
class Person Gender age;
model y = Gender Age Gender*Age / s;
repeated age / type=cs subject=Person r;
run;

 

proc mixed data=pr ;
class Person Gender age;
model y = Gender Age Gender*Age / s;
repeated age / type=cs subject=Person*age r;
run;

 

You can see that the above two programs produce the same results. Generally speaking, the second program has a REPEATED statement that is more complex than necessary. It seems to be okay for this model, but if you use a different covariance structure, such as AR(1) or UN, you might see something like the following:

NOTE: Convergence criteria met but final Hessian is not positive definite.

So I would use the first model specification.
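
If you do want to try a different covariance structure, a minimal sketch using the first specification and the pr data above changes only the TYPE= option. Checking the log for the Hessian note after each fit is a suggested habit here, not something the runs above demonstrate.

```sas
/* First specification (SUBJECT=Person) with an AR(1) structure */
proc mixed data=pr;
  class Person Gender Age;
  model y = Gender Age Gender*Age / s;
  repeated Age / type=ar(1) subject=Person r;
run;
```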

 

 

stats2554
Calcite | Level 5
Thanks! If the subject= option is not included and the data are treated as uncorrelated, is this still considered a “repeated measures” analysis? The p-values for the treatment*time interaction indicate that the correct trend is being detected, but I wasn’t sure whether there is real utility in doing such an analysis (picking up the same trends, with less power?). Thanks!
jiltao
SAS Super FREQ

I am not sure about your question: do you mean no SUBJECT= option, or no REPEATED statement? What is the model now? And are your data repeated measures?

 

Thanks,

Jill

stats2554
Calcite | Level 5
It is repeated measures data at four timepoints, 1 to 4 hours. Without a subject= option, my data give the correct significance for treatment and the treatment*time interaction, except that the p-values are lower than when a subject= option is entered. So I’m curious as to how it treats the data at the four timepoints without a defined subject, especially since the results are similar to those from a subject= option in the REPEATED statement.
jiltao
SAS Super FREQ

Are you comparing the models with --

 

repeated age / type=cs ;

and

repeated age / type=cs subject=Person;

Or something else?

stats2554
Calcite | Level 5
Correct, both of those models with PROC MIXED (just using a different data set). I’m curious as to how SAS treats the data differently under the first vs. the second model you mention. I’m curious because it still shows a valid, significant treatment*time interaction, as it should, and the results are not far off when I add Dunnett’s test to compare lsmeans between treatments. So it still seems to be making valid inferences. I’m just not sure how it treats the data at repeated timepoints without a subject= option in the REPEATED statement.
jiltao
SAS Super FREQ

repeated age / type=cs;

just estimates the residual variance and breaks it into two pieces. So your R matrix has a constant variance (the sum of the two estimates) on the diagonal and zero covariance everywhere else.

repeated age / type=cs subject=person;

estimates a block-diagonal covariance matrix R, with one block for each person. Within each block, the structure is compound symmetry.
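
Written out for the pr data above (27 persons, 4 ages, 108 observations), the two cases look roughly like this. The symbols are my own notation, not something from the output: \sigma_1 stands for the CS covariance parameter and \sigma^2 for the residual variance.

```latex
% With SUBJECT=Person: one 4x4 compound-symmetry block per person
R_i =
\begin{pmatrix}
\sigma^2 + \sigma_1 & \sigma_1 & \sigma_1 & \sigma_1 \\
\sigma_1 & \sigma^2 + \sigma_1 & \sigma_1 & \sigma_1 \\
\sigma_1 & \sigma_1 & \sigma^2 + \sigma_1 & \sigma_1 \\
\sigma_1 & \sigma_1 & \sigma_1 & \sigma^2 + \sigma_1
\end{pmatrix},
\qquad
R = \operatorname{diag}(R_1, \dots, R_{27})

% Without SUBJECT=, as described above: constant variance, zero covariance
R = (\sigma^2 + \sigma_1)\, I_{108}
```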

You can add the R option in the REPEATED statement to see the R matrix in the output.

If you want to model the correlations in your data, you would need to use the second specification (with the SUBJECT= option), and go with whatever that model produces for your fixed effects.
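
To see the difference side by side, a sketch along these lines can be run on the pr data from earlier in the thread. Comparing the Covariance Parameter Estimates and Type 3 Tests of Fixed Effects tables between the two fits is a suggestion, not output reproduced here.

```sas
/* Fit A: no SUBJECT= option */
proc mixed data=pr;
  class Person Gender Age;
  model y = Gender Age Gender*Age / s;
  repeated Age / type=cs;
run;

/* Fit B: SUBJECT=Person; R and RCORR display the first person's
   4x4 covariance and correlation blocks */
proc mixed data=pr;
  class Person Gender Age;
  model y = Gender Age Gender*Age / s;
  repeated Age / type=cs subject=Person r rcorr;
run;
```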
