When random effects exist, why are the results from PROC MIXED and PROC GLM the same?
This is a nonreplicated two-way crossover study.
The random effect is subject nested within sequence; the fixed effects are treatment, sequence, and period.
I mocked up balanced data to try PROC MIXED and PROC GLM, and found the same coefficient estimates in both.
My main question is this: as we know,
PROC MIXED treats subject within sequence as a random effect,
and
PROC GLM treats that random effect as a fixed effect.
Why do they produce the same result for balanced data?
What is the rationale behind it? I have searched extensively, but nothing I found gives a specific explanation.
So, if you know the answer, please help me! Thanks in advance.
The SAS dataset, code, and results are attached below. (This is mock data.)
DATA TRY;
INPUT SUBJECT$ TREATMENT$ PERIOD SEQUENCE CONC;
DATALINES;
001 A 1 1 2.90
002 A 1 1 3.14
003 A 1 1 3.49
004 A 1 1 5.28
005 B 1 2 2.39
006 B 1 2 3.7
007 A 1 1 3.68
008 B 1 2 1.8
009 B 1 2 2.28
010 B 1 2 2.44
001 B 2 1 2.65
002 B 2 1 1.96
003 B 2 1 3.18
004 B 2 1 3.66
005 A 2 2 3.83
006 A 2 2 4.62
007 B 2 1 2.22
008 A 2 2 3.5
009 A 2 2 1.76
010 A 2 2 4.88
;
RUN;
ods output lsmeans=result2;
ods output lsmeandiffcl=result1;
ods output overallanova=result3;
/*****/PROC GLM DATA=TRY;
CLASS TREATMENT PERIOD SEQUENCE SUBJECT;
MODEL CONC=TREATMENT PERIOD SEQUENCE SUBJECT(SEQUENCE)/SOLUTION;
RANDOM SUBJECT(SEQUENCE);
LSMEANS TREATMENT/STDERR PDIFF=control("A","B") CL ALPHA=0.1 ADJUST=T;
RUN;
/***/
ods output Estimates=result1;
ods output LSMeans=result2 ;
ods output Diffs=result3 ;
PROC MIXED DATA=TRY;
CLASS TREATMENT PERIOD SEQUENCE SUBJECT;
MODEL CONC=TREATMENT PERIOD SEQUENCE/SOLUTION;
RANDOM SUBJECT(SEQUENCE);
LSMEANS TREATMENT/PDIFF=control("A","B") CL ALPHA=0.1 ;
RUN;
The SAS System |
The GLM Procedure
Class Level Information | ||
Class | Levels | Values |
TREATMENT | 2 | A B |
PERIOD | 2 | 1 2 |
SEQUENCE | 2 | 1 2 |
SUBJECT | 10 | 001 002 003 004 005 006 007 008 009 010 |
Number of Observations Read | 20 |
Number of Observations Used | 20 |
Dependent Variable: CONC
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 11 | 16.14470000 | 1.46770000 | 3.59 | 0.0402 |
Error | 8 | 3.27122000 | 0.40890250 | | |
Corrected Total | 19 | 19.41592000 | | | |
R-Square | Coeff Var | Root MSE | CONC Mean |
0.831519 | 20.18481 | 0.639455 | 3.168000 |
Source | DF | Type I SS | Mean Square | F Value | Pr > F |
TREATMENT | 1 | 5.83200000 | 5.83200000 | 14.26 | 0.0054 |
PERIOD | 1 | 0.06728000 | 0.06728000 | 0.16 | 0.6956 |
SEQUENCE | 1 | 0.04608000 | 0.04608000 | 0.11 | 0.7457 |
SUBJECT(SEQUENCE) | 8 | 10.19934000 | 1.27491750 | 3.12 | 0.0641 |
Source | DF | Type III SS | Mean Square | F Value | Pr > F |
TREATMENT | 1 | 5.83200000 | 5.83200000 | 14.26 | 0.0054 |
PERIOD | 1 | 0.06728000 | 0.06728000 | 0.16 | 0.6956 |
SEQUENCE | 1 | 0.04608000 | 0.04608000 | 0.11 | 0.7457 |
SUBJECT(SEQUENCE) | 8 | 10.19934000 | 1.27491750 | 3.12 | 0.0641 |
Parameter | Estimate | | Standard Error | t Value | Pr > |t| |
Intercept | 3.178000000 | B | 0.49531959 | 6.42 | 0.0002 |
TREATMENT A | 1.080000000 | B | 0.28597290 | 3.78 | 0.0054 |
TREATMENT B | 0.000000000 | B | . | . | . |
PERIOD 1 | -0.116000000 | B | 0.28597290 | -0.41 | 0.6956 |
PERIOD 2 | 0.000000000 | B | . | . | . |
SEQUENCE 1 | -0.710000000 | B | 0.63945485 | -1.11 | 0.2991 |
SEQUENCE 2 | 0.000000000 | B | . | . | . |
SUBJECT(SEQUENCE) 001 1 | -0.175000000 | B | 0.63945485 | -0.27 | 0.7913 |
SUBJECT(SEQUENCE) 002 1 | -0.400000000 | B | 0.63945485 | -0.63 | 0.5490 |
SUBJECT(SEQUENCE) 003 1 | 0.385000000 | B | 0.63945485 | 0.60 | 0.5638 |
SUBJECT(SEQUENCE) 004 1 | 1.520000000 | B | 0.63945485 | 2.38 | 0.0448 |
SUBJECT(SEQUENCE) 007 1 | 0.000000000 | B | . | . | . |
SUBJECT(SEQUENCE) 005 2 | -0.550000000 | B | 0.63945485 | -0.86 | 0.4148 |
SUBJECT(SEQUENCE) 006 2 | 0.500000000 | B | 0.63945485 | 0.78 | 0.4568 |
SUBJECT(SEQUENCE) 008 2 | -1.010000000 | B | 0.63945485 | -1.58 | 0.1529 |
SUBJECT(SEQUENCE) 009 2 | -1.640000000 | B | 0.63945485 | -2.56 | 0.0334 |
SUBJECT(SEQUENCE) 010 2 | 0.000000000 | B | . | . | . |
Note: | The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable. |
Source | Type III Expected Mean Square |
TREATMENT | Var(Error) + Q(TREATMENT) |
PERIOD | Var(Error) + Q(PERIOD) |
SEQUENCE | Var(Error) + 2 Var(SUBJECT(SEQUENCE)) + Q(SEQUENCE) |
SUBJECT(SEQUENCE) | Var(Error) + 2 Var(SUBJECT(SEQUENCE)) |
Least Squares Means
TREATMENT | CONC LSMEAN | Standard Error | H0:LSMEAN=0 Pr > |t| | H0:LSMean1=LSMean2 Pr > |t| |
A | 3.70800000 | 0.20221338 | <.0001 | 0.0054 |
B | 2.62800000 | 0.20221338 | <.0001 | |
TREATMENT | CONC LSMEAN | 90% Confidence Limits | |
A | 3.708000 | 3.331975 | 4.084025 |
B | 2.628000 | 2.251975 | 3.004025 |
Least Squares Means for Effect TREATMENT | ||||
i | j | Difference Between | 90% Confidence Limits for LSMean(i)-LSMean(j) | |
2 | 1 | -1.080000 | -1.611780 | -0.548220 |
Proc mixed
The Mixed Procedure
Model Information | |
Data Set | WORK.TRY |
Dependent Variable | CONC |
Covariance Structure | Variance Components |
Estimation Method | REML |
Residual Variance Method | Profile |
Fixed Effects SE Method | Model-Based |
Degrees of Freedom Method | Containment |
Class Level Information | ||
Class | Levels | Values |
TREATMENT | 2 | A B |
PERIOD | 2 | 1 2 |
SEQUENCE | 2 | 1 2 |
SUBJECT | 10 | 001 002 003 004 005 006 007 008 009 010 |
Dimensions | |
Covariance Parameters | 2 |
Columns in X | 7 |
Columns in Z | 10 |
Subjects | 1 |
Max Obs Per Subject | 20 |
Number of Observations | |
Number of Observations Read | 20 |
Number of Observations Used | 20 |
Number of Observations Not Used | 0 |
Iteration History | |||
Iteration | Evaluations | -2 Res Log Like | Criterion |
0 | 1 | 50.47676453 | |
1 | 1 | 48.01890254 | 0.00000000 |
Convergence criteria met. |
Covariance Parameter Estimates | |
Cov Parm | Estimate |
SUBJECT(SEQUENCE) | 0.4330 |
Residual | 0.4089 |
Fit Statistics | |
-2 Res Log Likelihood | 48.0 |
AIC (smaller is better) | 52.0 |
AICC (smaller is better) | 52.9 |
BIC (smaller is better) | 52.6 |
Solution for Fixed Effects | ||||||||
Effect | TREATMENT | PERIOD | SEQUENCE | Estimate | Standard Error | DF | t Value | Pr > |t| |
Intercept | | | | 2.6380 | 0.4103 | 8 | 6.43 | 0.0002 |
TREATMENT | A | | | 1.0800 | 0.2860 | 8 | 3.78 | 0.0054 |
TREATMENT | B | | | 0 | . | . | . | . |
PERIOD | | 1 | | -0.1160 | 0.2860 | 8 | -0.41 | 0.6956 |
PERIOD | | 2 | | 0 | . | . | . | . |
SEQUENCE | | | 1 | 0.09600 | 0.5050 | 8 | 0.19 | 0.8540 |
SEQUENCE | | | 2 | 0 | . | . | . | . |
Type 3 Tests of Fixed Effects | ||||
Effect | Num DF | Den DF | F Value | Pr > F |
TREATMENT | 1 | 8 | 14.26 | 0.0054 |
PERIOD | 1 | 8 | 0.16 | 0.6956 |
SEQUENCE | 1 | 8 | 0.04 | 0.8540 |
Least Squares Means | |||||||||
Effect | TREATMENT | Estimate | Standard Error | DF | t Value | Pr > |t| | Alpha | Lower | Upper |
TREATMENT | A | 3.7080 | 0.2902 | 8 | 12.78 | <.0001 | 0.1 | 3.1684 | 4.2476 |
TREATMENT | B | 2.6280 | 0.2902 | 8 | 9.06 | <.0001 | 0.1 | 2.0884 | 3.1676 |
Differences of Least Squares Means | ||||||||||
Effect | TREATMENT | _TREATMENT | Estimate | Standard Error | DF | t Value | Pr > |t| | Alpha | Lower | Upper |
TREATMENT | B | A | -1.0800 | 0.2860 | 8 | -3.78 | 0.0054 | 0.1 | -1.6118 | -0.5482 |
You can see that the coefficient estimates for treatments A and B are almost identical in PROC MIXED and PROC GLM.
Yes, absolutely. My question is why PROC GLM and PROC MIXED give the same result.
Because your data and model are balanced, the point estimates should be the same (at least to a few decimal places). However, the variance estimates are quite different. Look at the size of the standard errors for the least squares means. For GLM (a narrow inference approach), they are 0.2022, while for MIXED (a broad inference approach), they are 0.2902, an increase of roughly 43.5%. Note also that the F test for sequence differs between the two approaches (0.11 in GLM versus 0.04 in MIXED), due to the nesting of subjects within sequence, which in this design is aliased with treatment*period: MIXED tests sequence against the subject(sequence) variability, whereas GLM by default tests it against the residual error.
That is the difference. Take a look at Littell et al.'s SAS for Mixed Models, 2nd ed. for additional material that compares GLM to MIXED.
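The standard errors and F values discussed here can be reproduced by hand from the mean squares in the posted output. The following is a sketch of that arithmetic, assuming the usual balanced-data formulas with 10 observations per treatment and 2 observations per subject; the symbols $\hat\sigma^2_e$ (residual variance) and $\hat\sigma^2_s$ (subject-within-sequence variance) are notation introduced here for illustration, not taken from the SAS output.

```latex
\begin{align*}
% Variance components from the GLM mean squares and the expected mean squares table
\hat\sigma^2_e &= MS_{\text{Error}} = 0.4089, \\
\hat\sigma^2_s &= \frac{MS_{\text{SUBJECT(SEQUENCE)}} - MS_{\text{Error}}}{2}
                = \frac{1.2749 - 0.4089}{2} = 0.4330, \\[4pt]
% Standard error of a treatment LS-mean: subject treated as fixed (GLM) vs random (MIXED)
SE_{\text{GLM}}(\text{LSMEAN}) &= \sqrt{\frac{\hat\sigma^2_e}{10}}
                = \sqrt{\frac{0.4089}{10}} = 0.2022, \\
SE_{\text{MIXED}}(\text{LSMEAN}) &= \sqrt{\frac{\hat\sigma^2_s + \hat\sigma^2_e}{10}}
                = \sqrt{\frac{0.4330 + 0.4089}{10}} = 0.2902, \\[4pt]
% The treatment difference is a within-subject contrast, so the subject variance cancels
SE(\text{A} - \text{B}) &= \sqrt{\frac{2\,\hat\sigma^2_e}{10}}
                = \sqrt{\frac{2 \times 0.4089}{10}} = 0.2860
                \quad \text{(same in both procedures)}, \\[4pt]
% Sequence is a between-subject effect, so the choice of error term matters
F_{\text{GLM}}(\text{SEQUENCE}) &= \frac{0.04608}{0.4089} = 0.11, \qquad
F_{\text{MIXED}}(\text{SEQUENCE}) = \frac{0.04608}{1.2749} = 0.04 .
\end{align*}
```

The two variance components match the REML estimates in the MIXED covariance parameter table (0.4330 and 0.4089). That agreement is what one would expect for balanced data, and it is why the treatment estimate, its standard error, and its confidence limits coincide while the LS-mean standard errors and the sequence test do not.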
Steve Denham
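As a follow-up to the point about the sequence F test, PROC GLM can be asked to use SUBJECT(SEQUENCE) as the error term for SEQUENCE. The code below is a minimal sketch, assuming the TRY dataset from the original post; it uses the TEST option of the RANDOM statement and an explicit TEST statement, both of which are standard PROC GLM syntax. It is one way to illustrate the difference, not a recommendation to prefer GLM for this design.

```sas
/* Sketch: test SEQUENCE against SUBJECT(SEQUENCE) in PROC GLM,    */
/* i.e. the error term PROC MIXED uses for this between-subject    */
/* effect. Assumes the TRY dataset created in the original post.   */
PROC GLM DATA=TRY;
  CLASS TREATMENT PERIOD SEQUENCE SUBJECT;
  MODEL CONC = TREATMENT PERIOD SEQUENCE SUBJECT(SEQUENCE);
  /* The TEST option builds F tests from the expected mean squares */
  RANDOM SUBJECT(SEQUENCE) / TEST;
  /* Explicit form: hypothesis SEQUENCE, error SUBJECT(SEQUENCE)   */
  TEST H=SEQUENCE E=SUBJECT(SEQUENCE);
RUN;
QUIT;
```

With the mean squares shown in the posted output, the sequence F from these tests should come out at about 0.04608 / 1.2749 = 0.04 on 1 and 8 degrees of freedom, in line with the PROC MIXED Type 3 test. Note that these options change only the F tests; the LS-mean standard errors from GLM still treat subject as fixed.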