Example 56.2 uses proc mixed to examine growth measurements for girls and boys at ages 8, 10, 12 and 14. The proposed syntax is:
data pr;
   input Person Gender $ y1 y2 y3 y4;
   y=y1; Age=8;  output;
   y=y2; Age=10; output;
   y=y3; Age=12; output;
   y=y4; Age=14; output;
   drop y1-y4;
   datalines;
 1 F 21.0 20.0 21.5 23.0
 2 F 21.0 21.5 24.0 25.5
 3 F 20.5 24.0 24.5 26.0
 4 F 23.5 24.5 25.0 26.5
 5 F 21.5 23.0 22.5 23.5
 6 F 20.0 21.0 21.0 22.5
 7 F 21.5 22.5 23.0 25.0
 8 F 23.0 23.0 23.5 24.0
 9 F 20.0 21.0 22.0 21.5
10 F 16.5 19.0 19.0 19.5
11 F 24.5 25.0 28.0 28.0
12 M 26.0 25.0 29.0 31.0
13 M 21.5 22.5 23.0 26.5
14 M 23.0 22.5 24.0 27.5
15 M 25.5 27.5 26.5 27.0
16 M 20.0 23.5 22.5 26.0
17 M 24.5 25.5 27.0 28.5
18 M 22.0 22.0 24.5 26.5
19 M 24.0 21.5 24.5 25.5
20 M 23.0 20.5 31.0 26.0
21 M 27.5 28.0 31.0 31.5
22 M 23.0 23.0 23.5 25.0
23 M 21.5 23.5 24.0 28.0
24 M 17.0 24.5 26.0 29.5
25 M 22.5 25.5 25.5 26.0
26 M 23.0 24.5 26.0 30.0
27 M 22.0 21.5 23.5 25.0
;
proc mixed data=pr method=ml covtest;
   class Person Gender;
   model y = Gender Age Gender*Age / s;
   repeated / type=un subject=Person r;
run;
With regard to the 'Solution for Fixed Effects' table (see below), the authors conclude that "The girls' starting point is larger than that for the boys, but their growth rate is about half of the boys".
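Effect | Gender | Estimate | Standard Error | DF | t Value | Pr > |t| |
Intercept | | 15.8423 | 0.9356 | 25 | 16.93 | <.0001 |
Gender | F | 1.5831 | 1.4658 | 25 | 1.08 | 0.2904 |
Gender | M | 0 | . | . | . | . |
Age | | 0.8268 | 0.07911 | 25 | 10.45 | <.0001 |
Gender*Age | F | -0.3504 | 0.1239 | 25 | -2.83 | 0.0091 |
Gender*Age | M | 0 | . | . | . | . |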
So my question is: why was Age not included in the class statement?
A proc means analysis for Age=8 shows that the mean for boys is larger than that for girls (first table below). The second table below is the solution for fixed effects when Age(ref=first) is added to the class statement. Wouldn't this better reflect the data?
Analysis Variable : y
Gender | N Obs | N | Mean | Std Dev | Minimum | Maximum |
F | 11 | 11 | 21.1818182 | 2.1245320 | 16.5000000 | 24.5000000 |
M | 16 | 16 | 22.8750000 | 2.4528895 | 17.0000000 | 27.5000000 |
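Effect | Gender | Age | Estimate | Standard Error | DF | t Value | Pr > |t| |
Intercept | | | 22.8750 | 0.5598 | 25 | 40.86 | <.0001 |
Gender | F | | -1.6932 | 0.8771 | 25 | -1.93 | 0.0650 |
Gender | M | | 0 | . | . | . | . |
Age | | 10 | 0.9375 | 0.4910 | 25 | 1.91 | 0.0678 |
Age | | 12 | 2.8438 | 0.4842 | 25 | 5.87 | <.0001 |
Age | | 14 | 4.5938 | 0.5369 | 25 | 8.56 | <.0001 |
Age | | 8 | 0 | . | . | . | . |
Gender*Age | F | 10 | 0.1080 | 0.7693 | 25 | 0.14 | 0.8895 |
Gender*Age | F | 12 | -0.9347 | 0.7585 | 25 | -1.23 | 0.2293 |
Gender*Age | F | 14 | -1.6847 | 0.8411 | 25 | -2.00 | 0.0561 |
Gender*Age | F | 8 | 0 | . | . | . | . |
Gender*Age | M | 10 | 0 | . | . | . | . |
Gender*Age | M | 12 | 0 | . | . | . | . |
Gender*Age | M | 14 | 0 | . | . | . | . |
Gender*Age | M | 8 | 0 | . | . | . | . |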
Age is a continuous variable, so the model treated it as such. The authors want one parameter to indicate the dependence on age.
If the subjects were classified as "Children", "Teenagers", and "Adults", then the variable would be treated as a classification effect. There would be three parameters (two independent parameters) in that model.
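It may also help to write out what the continuous-Age estimates imply. The following is only a worked sketch using the estimates from the first 'Solution for Fixed Effects' table in the question; the indicator F (1 for girls, 0 for boys) is notation introduced here for illustration. Because Age is continuous, the intercept, and therefore the "starting point" in the authors' sentence, refers to Age = 0 rather than to the first measurement at age 8.

```latex
\begin{align*}
\hat{y} &= 15.8423 + 1.5831\,F + 0.8268\,\mathrm{Age} - 0.3504\,F\cdot\mathrm{Age} \\[4pt]
\text{Girls } (F=1):\quad \hat{y} &= (15.8423 + 1.5831) + (0.8268 - 0.3504)\,\mathrm{Age}
                                   = 17.4254 + 0.4764\,\mathrm{Age} \\
\text{Boys } (F=0):\quad \hat{y} &= 15.8423 + 0.8268\,\mathrm{Age} \\[4pt]
\text{At } \mathrm{Age}=8:\quad \hat{y}_{\text{girls}} &= 17.4254 + 0.4764 \times 8 = 21.2366 \\
\hat{y}_{\text{boys}} &= 15.8423 + 0.8268 \times 8 = 22.4567 \\[4pt]
\text{Lines cross where}\quad 1.5831 &= 0.3504\,\mathrm{Age}
  \;\Longrightarrow\; \mathrm{Age} \approx 4.5
\end{align*}
```

So the fitted lines put the boys above the girls at age 8 (about 22.46 versus 21.24), which is in line with the proc means output (22.88 versus 21.18). The girls' line is higher only below roughly age 4.5, which is outside the observed range of 8 to 14, and the ratio of slopes, 0.4764 / 0.8268, is about 0.58.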
If you want to account for possible nonlinearity of response due to age, you could change the code slightly (including age as a class effect) to get:
proc mixed data=pr method=ml covtest;
class Person Gender Age;
model y = Gender Age Gender*Age / s;
repeated Age/ type=un subject=Person r;
run;
Note that this will "use up" some degrees of freedom, so that standard errors may be larger and tests somewhat different. There are many ways to proceed at this point, especially if you wished to make comparisons of expected values at various ages.
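One possible way to make such comparisons, given only as a sketch and not as the approach used in the documentation example, is to request least squares means for the Gender*Age cells and slice them by Age. The LSMEANS statement and its SLICE= option are standard PROC MIXED syntax; choosing them here is my assumption about what comparisons would be of interest.

```sas
/* Sketch: Age as a class effect, with gender comparisons at each age. */
proc mixed data=pr method=ml covtest;
   class Person Gender Age;
   model y = Gender Age Gender*Age / s;
   repeated Age / type=un subject=Person r;
   /* One LS-mean per Gender-by-Age cell; SLICE=Age tests the
      Gender effect separately at ages 8, 10, 12 and 14. */
   lsmeans Gender*Age / slice=Age;
run;
```

With only two genders, each slice is a one-degree-of-freedom test of girls versus boys at that age, so the Age=8 slice addresses the comparison raised in the question directly.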
Steve Denham