With a repeated statement like
repeated time / type=cs subject=subject;
you are modeling the R matrix as a block diagonal structure with a common covariance on all the observations from the same level of SUBJECT.
With your data description, if SUBJECT takes on a unique value for each unique subject in your data (i.e., subjects 1, 2, 3 in block a and subjects 4, 5, 6 in block b), then this syntax will be fine. If you have subjects nested in blocks (i.e., subjects 1, 2, 3 in block a and subjects 1, 2, 3 in block b) and subject 1 in block a is a different subject than subject 1 in block b, then you need subject=subject(block) on your repeated statement.
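For the nested case, a minimal sketch might look like the following. The data set name (mydata), the response (y), and the fixed effects are placeholders I made up, since I have not seen your actual model; only the SUBJECT= specification is the point here.

```sas
/* Sketch only: mydata, y, and the model effects are hypothetical names. */
proc mixed data=mydata;
  class block subject time;
  model y = time;
  /* subject IDs restart within each block, so the experimental unit */
  /* is identified by subject nested within block                    */
  repeated time / type=cs subject=subject(block);
run;
```

Note that BLOCK has to be on the CLASS statement for the nested effect to be used this way.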
A statement like
repeated time / type=cs subject=subject*time;
is likely incorrect. The SUBJECT= effect on the REPEATED statement is the experimental unit on which repeated measures were taken. If time is your repeated measure, then it is unlikely that you took repeated measures over time on the experimental unit subject*time.
GROUP=TREATMENT will estimate the TYPE=CS structure separately for each level of treatment. If you suspect you have heterogeneity in the variances across levels of treatment, then that is the way to handle it. You do need sufficient data to estimate the variances for each treatment. If you are having trouble using GROUP= then that can be a sign that you do not have sufficient data to estimate the variance separately.
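As a rough sketch of the GROUP= approach (again with made-up names: mydata, y, subject, treatment, and time are assumptions, not your actual variables):

```sas
/* Sketch only: data set and variable names are hypothetical. */
proc mixed data=mydata;
  class subject treatment time;
  model y = treatment time treatment*time;
  /* GROUP=treatment requests a separate set of CS parameters   */
  /* for each level of treatment; the GROUP= effect must be a   */
  /* CLASS variable                                             */
  repeated time / type=cs subject=subject group=treatment;
run;
```

The Covariance Parameter Estimates table should then list the CS parameters once per treatment level, which is why each level needs enough subjects to support its own estimates.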
If you have a repeated statement like
repeated time / type=cs;
then you are not estimating anything in the R matrix except a diagonal element. MIXED might partition the variance into two components (with TYPE=CS), but the components will sum to the residual variance you obtain when you drop the repeated statement completely. So, this repeated statement is the same as having no repeated statement at all. Specifically, this repeated statement does not correlate any observations in your data.
I used the data and program below for your questions:
data pr;
input Person Gender $ y1 y2 y3 y4;
y=y1; Age=8; output;
y=y2; Age=10; output;
y=y3; Age=12; output;
y=y4; Age=14; output;
drop y1-y4;
datalines;
1 F 21.0 20.0 21.5 23.0
2 F 21.0 21.5 24.0 25.5
3 F 20.5 24.0 24.5 26.0
4 F 23.5 24.5 25.0 26.5
5 F 21.5 23.0 22.5 23.5
6 F 20.0 21.0 21.0 22.5
7 F 21.5 22.5 23.0 25.0
8 F 23.0 23.0 23.5 24.0
9 F 20.0 21.0 22.0 21.5
10 F 16.5 19.0 19.0 19.5
11 F 24.5 25.0 28.0 28.0
12 M 26.0 25.0 29.0 31.0
13 M 21.5 22.5 23.0 26.5
14 M 23.0 22.5 24.0 27.5
15 M 25.5 27.5 26.5 27.0
16 M 20.0 23.5 22.5 26.0
17 M 24.5 25.5 27.0 28.5
18 M 22.0 22.0 24.5 26.5
19 M 24.0 21.5 24.5 25.5
20 M 23.0 20.5 31.0 26.0
21 M 27.5 28.0 31.0 31.5
22 M 23.0 23.0 23.5 25.0
23 M 21.5 23.5 24.0 28.0
24 M 17.0 24.5 26.0 29.5
25 M 22.5 25.5 25.5 26.0
26 M 23.0 24.5 26.0 30.0
27 M 22.0 21.5 23.5 25.0
;
proc mixed data=pr ;
class Person Gender age;
model y = Gender Age Gender*Age / s;
repeated age / type=cs subject=Person r;
run;
proc mixed data=pr ;
class Person Gender age;
model y = Gender Age Gender*Age / s;
repeated age / type=cs subject=Person*age r;
run;
You can see that the two programs above produce the same results. Generally speaking, the REPEATED statement in the second program is more complex than necessary. It seems to be okay for this model, but if you use a different covariance structure, such as AR(1) or UN, you might see a note like the following:
NOTE: Convergence criteria met but final Hessian is not positive definite.
So I would use the first model specification.
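If you do want to try a different covariance structure, a sketch of the first specification with AR(1) would be the following. It assumes the pr data set created above; I am not showing output for it here.

```sas
/* Same model as the first program, with only the covariance structure changed. */
proc mixed data=pr;
  class Person Gender age;
  model y = Gender Age Gender*Age / s;
  /* subject=Person keeps Person as the experimental unit;       */
  /* replace type=ar(1) with type=un for an unstructured block   */
  repeated age / type=ar(1) subject=Person r;
run;
```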
I am not sure about your question: do you mean no SUBJECT= option, or no REPEATED statement at all? What is the model now? And are your data repeated measures?
Thanks,
Jill
Are you comparing the models with --
repeated age / type=cs ;
and
repeated age / type=cs subject=Person;
Or something else?
repeated age / type=cs;
just estimates the residual variance and breaks it into two pieces. So your R matrix has a constant variance (the sum of the two estimates) on the diagonal and zero covariance elsewhere.
repeated age / type=cs subject=person;
estimates a block-diagonal covariance matrix R, with each block corresponding to one person. Within each block, the structure is compound symmetry.
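To make that concrete, here is a sketch of the structure for the example data (27 persons, 4 ages each). The symbols are my notation: \sigma_1 is the common covariance (the CS parameter) and \sigma^2 is the residual variance, following the TYPE=CS parameterization in which element (i,j) is \sigma_1 + \sigma^2 1(i=j). The pmatrix environment assumes the amsmath package.

```latex
\[
R =
\begin{pmatrix}
R_1 &        &        \\
    & \ddots &        \\
    &        & R_{27}
\end{pmatrix},
\qquad
R_i =
\begin{pmatrix}
\sigma_1 + \sigma^2 & \sigma_1 & \sigma_1 & \sigma_1 \\
\sigma_1 & \sigma_1 + \sigma^2 & \sigma_1 & \sigma_1 \\
\sigma_1 & \sigma_1 & \sigma_1 + \sigma^2 & \sigma_1 \\
\sigma_1 & \sigma_1 & \sigma_1 & \sigma_1 + \sigma^2
\end{pmatrix}
\]
```

All entries outside the diagonal blocks are zero, so observations from different persons are uncorrelated, while any two ages within the same person share the covariance \sigma_1.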
You can add the R option in the REPEATED statement to see the R matrix in the output.
If you want to model the correlations in your data, you might need to use the second specification (with the SUBJECT= option). And go with whatever this model produces for your fixed effects.