I have a large, longitudinal data set -- seven years of data for approximately 115,000 students (approximately 500,000 observations), attending approximately 600 schools. I am doing cross-classified individual growth modeling (cross-classified because students change schools). Since the data set is so large, I am estimating models with small (2 percent) subsamples of the data set. However, with the 2 percent subsamples, the data are sparse; and so PROC MIXED cannot estimate the covariance parameters. Therefore, I am using a combination of HPMIXED and MIXED. This way I can estimate the covariance parameters in HPMIXED, and then pass them along to MIXED. Then I can compute sequential F-tests and make adjustments for multiple comparisons in testing differences between LSMEANS.
The grouping variable of interest is ProFnc: 1=proficient, 0=non-proficient. When I put the grouping variable in the model with no interactions, the fixed effects produced by HPMIXED and MIXED are the same. However, HPMIXED estimates a fixed effect for ProFnc=1, whereas MIXED estimates an effect for ProFnc=0.
Then, when I add to the model interactions with the grouping variable, the fixed effects seem to behave strangely. I don't know if I should ignore the fixed effects that HPMIXED produces, and just go with the fixed effects that MIXED produces. And if that is the case, can I trust the covariance parameters coming out of HPMIXED? Can I trust the fixed effects that MIXED is producing? If there is a problem, is there a change I can make in the code to fix this?
Below I have pasted the code and output for the two models I described above. Again, I first have the model with the grouping variable but no interactions; and second, I have the model with the grouping variable and its interactions. (Note: I realize these particular interactions are not significant, but I am trying to understand what SAS is doing here and which numbers I can trust.)
Thank you in advance for any help with this.
/* MODEL 1 - the grouping variable (ProFnc) and NO interactions */
/* HPMIXED */
PROC HPMIXED DATA=sub2pct_long noclprint;
CLASS stdpseudoid schcode ProFnc;
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
ProFnc
zCeldt /solution ;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
ODS OUTPUT covparms=hpmcov2pct;
RUN;
/* MIXED */
PROC MIXED DATA=sub2pct_long noclprint covtest lognote method=reml;
CLASS stdpseudoid schcode ProFnc;
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
zCeldt
ProFnc /solution htype=1;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
PARMS/PDATA=hpmcov2pct hold=1,2,3,4,5 noiter;
RUN;
THE HPMIXED PROCEDURE | |
Data Set | WORK.SUB2PCT_LONG |
Response Variable | zELA |
Estimation Method | Restricted Maximum Likelihood (REML) |
Degrees of Freedom Method | Residual |
Number of Observations Read | 10175 |
Number of Observations Used | 8898 |
Dimensions | |
G-side Cov. Parameters | 4 |
R-side Cov. Parameters | 1 |
Columns in X | 12 |
Columns in Z | 4969 |
Subjects (Blocks in V) | 1 |
Optimization Information | |
Optimization Technique | Dual Quasi-Newton |
Parameters in Optimization | 4 |
Lower Boundaries | 3 |
Upper Boundaries | 0 |
Residual Variance | Profiled |
Iteration History | ||||
Iteration | Evaluations | Objective Function | Change | Max Gradient |
0 | 4 | 18310.276802 | . | 425.1763 |
1 | 4 | 18306.028084 | 4.24871820 | 56.80605 |
2 | 3 | 18305.679162 | 0.34892203 | 74.52385 |
3 | 3 | 18305.05507 | 0.62409210 | 39.65128 |
4 | 4 | 18305.032432 | 0.02263798 | 38.72127 |
5 | 4 | 18304.982517 | 0.04991456 | 4.723682 |
6 | 3 | 18304.981792 | 0.00072531 | 0.027503 |
7 | 3 | 18304.981792 | 0.00000002 | 0.000091 |
Convergence criterion (GCONV=1E-8) satisfied. |
Covariance Parameter Estimates | ||
Cov Parm | Subject | Estimate |
UN(1,1) | StdPseudoId | 0.6356 |
UN(2,1) | StdPseudoId | -0.00929 |
UN(2,2) | StdPseudoId | 0.01124 |
UN(1,1) | SchCode | 0.009595 |
Residual | 0.2287 |
Fit Statistics | |
-2 Res Log Likelihood | 18305 |
AIC (smaller is better) | 18315 |
AICC (smaller is better) | 18315 |
BIC (smaller is better) | 18305 |
CAIC (smaller is better) | 18310 |
HQIC (smaller is better) | 18305 |
Solution for Fixed Effects | ||||||
Effect | ProFnc | Estimate | Standard Error | DF | t Value | Pr > |t| |
Intercept | -0.1662 | 0.04025 | 8887 | -4.13 | <.0001 | |
Timec | -0.02220 | 0.03404 | 8887 | -0.65 | 0.5144 | |
Timec*Timec | -0.00221 | 0.02929 | 8887 | -0.08 | 0.9398 | |
Timec*Timec*Timec | 0.002277 | 0.008339 | 8887 | 0.27 | 0.7848 | |
Time*Time*Time*Timec | -0.00020 | 0.000738 | 8887 | -0.26 | 0.7913 | |
Female | 0.05291 | 0.03625 | 8887 | 1.46 | 0.1444 | |
PrEdClGrd | 0.1948 | 0.09722 | 8887 | 2.00 | 0.0452 | |
PrEdSmClHS | 0.1982 | 0.04100 | 8887 | 4.83 | <.0001 | |
FRL | -0.04520 | 0.03694 | 8887 | -1.22 | 0.2212 | |
ProFnc | 0 | 0 | . | . | . | . |
ProFnc | 1 | 0.2171 | 0.03673 | 8887 | 5.91 | <.0001 |
zCeldt | 0.2361 | 0.01803 | 8887 | 13.10 | <.0001 |
THE MIXED PROCEDURE | |
Data Set | WORK.SUB2PCT_LONG |
Dependent Variable | zELA |
Covariance Structures | Unstructured, Variance Components |
Subject Effects | StdPseudoId, SchCode |
Estimation Method | REML |
Residual Variance Method | Parameter |
Fixed Effects SE Method | Model-Based |
Degrees of Freedom Method | Containment |
Covariance Parameters | 5 |
Columns in X | 20 |
Columns in Z | 4969 |
Subjects | 1 |
Max Obs Per Subject | 10175 |
Number of Observations | |
Number of Observations Read | 10175 |
Number of Observations Used | 8898 |
Number of Observations Not Used | 1277 |
Parameter Search | ||||||
CovP1 | CovP2 | CovP3 | CovP4 | CovP5 | Res Log Like | -2 Res Log Like |
0.6356 | -0.00929 | 0.01124 | 0.009595 | 0.2287 | -9152.4909 | 18304.9818 |
Covariance Parameter Estimates | |||||
Cov Parm | Subject | Estimate | Standard Error | Z Value | Pr Z |
UN(1,1) | StdPseudoId | 0.6356 | 0 | . | . |
UN(2,1) | StdPseudoId | -0.00929 | 0 | . | . |
UN(2,2) | StdPseudoId | 0.01124 | 0 | . | . |
UN(1,1) | SchCode | 0.009595 | 0 | . | . |
Residual | 0.2287 | 0 | . | . |
Fit Statistics | |
-2 Res Log Likelihood | 18305.0 |
AIC (smaller is better) | 18305.0 |
AICC (smaller is better) | 18305.0 |
BIC (smaller is better) | 18305.0 |
Solution for Fixed Effects | ||||||
Effect | ProFnc | Estimate | Standard Error | DF | t Value | Pr > |t| |
Intercept | 0.05087 | 0.04413 | 425 | 1.15 | 0.2497 | |
Timec | -0.02220 | 0.03404 | 2040 | -0.65 | 0.5144 | |
Timec*Timec | -0.00221 | 0.02929 | 4190 | -0.08 | 0.9398 | |
Timec*Timec*Timec | 0.002277 | 0.008339 | 4190 | 0.27 | 0.7848 | |
Time*Time*Time*Timec | -0.00020 | 0.000738 | 4190 | -0.26 | 0.7913 | |
Female | 0.05291 | 0.03625 | 4190 | 1.46 | 0.1444 | |
PrEdClGrd | 0.1948 | 0.09722 | 4190 | 2.00 | 0.0452 | |
PrEdSmClHS | 0.1982 | 0.04100 | 4190 | 4.83 | <.0001 | |
FRL | -0.04520 | 0.03694 | 4190 | -1.22 | 0.2212 | |
ProFnc | 0 | -0.2171 | 0.03673 | 4190 | -5.91 | <.0001 |
ProFnc | 1 | 0 | . | . | . | . |
zCeldt | 0.2361 | 0.01803 | 4190 | 13.10 | <.0001 |
Type I Tests of Fixed Effects | ||||
Effect | Num DF | Den DF | F Value | Pr > F |
Timec | 1 | 2040 | 0.66 | 0.4156 |
Timec*Timec | 1 | 4190 | 10.33 | 0.0013 |
Timec*Timec*Timec | 1 | 4190 | 0.03 | 0.8713 |
Time*Time*Time*Timec | 1 | 4190 | 0.01 | 0.9256 |
Female | 1 | 4190 | 5.38 | 0.0204 |
PrEdClGrd | 1 | 4190 | 3.38 | 0.0659 |
PrEdSmClHS | 1 | 4190 | 31.73 | <.0001 |
FRL | 1 | 4190 | 0.56 | 0.4555 |
ProFnc | 1 | 4190 | 33.85 | <.0001 |
zCeldt | 1 | 4190 | 171.49 | <.0001 |
/* MODEL 2 - the grouping variable (ProFnc), AND its interactions */
/* HPMIXED */
PROC HPMIXED DATA=sub2pct_long noclprint;
CLASS stdpseudoid schcode ProFnc;
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
zCeldt
ProFnc
ProFnc*Female
ProFnc*PrEdClGrd
ProFnc*PrEdSmClHS
ProFnc*zCeldt /solution ;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
ODS OUTPUT covparms=hpmcov2pct;
RUN;
/* MIXED */
PROC MIXED DATA=sub2pct_long noclprint covtest lognote method=reml;
CLASS stdpseudoid schcode ProFnc;
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
ProFnc
zCeldt
ProFnc*Female
ProFnc*PrEdClGrd
ProFnc*PrEdSmClHS
ProFnc*zCeldt /solution htype=1;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
PARMS/PDATA=hpmcov2pct hold=1,2,3,4,5 noiter;
RUN;
THE HPMIXED PROCEDURE | |
Data Set | WORK.SUB2PCT_LONG |
Response Variable | zELA |
Estimation Method | Restricted Maximum Likelihood (REML) |
Degrees of Freedom Method | Residual |
Number of Observations Read | 10175 |
Number of Observations Used | 8898 |
Dimensions | |
G-side Cov. Parameters | 4 |
R-side Cov. Parameters | 1 |
Columns in X | 20 |
Columns in Z | 4969 |
Subjects (Blocks in V) | 1 |
Optimization Information | |
Optimization Technique | Dual Quasi-Newton |
Parameters in Optimization | 4 |
Lower Boundaries | 3 |
Upper Boundaries | 0 |
Residual Variance | Profiled |
Iteration History | ||||
Iteration | Evaluations | Objective Function | Change | Max Gradient |
0 | 4 | 18320.233278 | . | 430.4562 |
1 | 4 | 18315.885135 | 4.34814326 | 57.74109 |
2 | 3 | 18315.533274 | 0.35186101 | 75.8847 |
3 | 3 | 18314.895565 | 0.63770890 | 40.89073 |
4 | 4 | 18314.872387 | 0.02317834 | 39.85029 |
5 | 4 | 18314.819313 | 0.05307395 | 3.502647 |
6 | 3 | 18314.818918 | 0.00039410 | 0.020809 |
7 | 3 | 18314.818918 | 0.00000001 | 0.000075 |
Convergence criterion (GCONV=1E-8) satisfied. |
Covariance Parameter Estimates | ||
Cov Parm | Subject | Estimate |
UN(1,1) | StdPseudoId | 0.6363 |
UN(2,1) | StdPseudoId | -0.00940 |
UN(2,2) | StdPseudoId | 0.01123 |
UN(1,1) | SchCode | 0.009609 |
Residual | 0.2287 |
Fit Statistics | |
-2 Res Log Likelihood | 18315 |
AIC (smaller is better) | 18325 |
AICC (smaller is better) | 18325 |
BIC (smaller is better) | 18315 |
CAIC (smaller is better) | 18320 |
HQIC (smaller is better) | 18315 |
Solution for Fixed Effects | ||||||
Effect | ProFnc | Estimate | Standard Error | DF | t Value | Pr > |t| |
Intercept | -0.1545 | 0.04406 | 8883 | -3.51 | 0.0005 | |
Timec | -0.02224 | 0.03404 | 8883 | -0.65 | 0.5136 | |
Timec*Timec | -0.00219 | 0.02929 | 8883 | -0.07 | 0.9404 | |
Timec*Timec*Timec | 0.002277 | 0.008339 | 8883 | 0.27 | 0.7848 | |
Time*Time*Time*Timec | -0.00020 | 0.000738 | 8883 | -0.27 | 0.7909 | |
Female | 0 | . | . | . | . | |
PrEdClGrd | 0 | . | . | . | . | |
PrEdSmClHS | 0 | . | . | . | . | |
FRL | -0.04623 | 0.03708 | 8883 | -1.25 | 0.2125 | |
ProFnc | 0 | 0 | . | . | . | . |
ProFnc | 1 | 0.1951 | 0.05881 | 8883 | 3.32 | 0.0009 |
zCeldt | 0 | . | . | . | . | |
Female*ProFnc | 0 | 0.06340 | 0.04752 | 8883 | 1.33 | 0.1822 |
Female*ProFnc | 1 | 0.03835 | 0.05611 | 8883 | 0.68 | 0.4943 |
PrEdClGrd*ProFnc | 0 | 0.1479 | 0.1253 | 8883 | 1.18 | 0.2377 |
PrEdClGrd*ProFnc | 1 | 0.2608 | 0.1544 | 8883 | 1.69 | 0.0911 |
PrEdSmClHS*ProFnc | 0 | 0.1508 | 0.05290 | 8883 | 2.85 | 0.0044 |
PrEdSmClHS*ProFnc | 1 | 0.2672 | 0.06477 | 8883 | 4.13 | <.0001 |
zCeldt*ProFnc | 0 | 0.2515 | 0.02414 | 8883 | 10.42 | <.0001 |
zCeldt*ProFnc | 1 | 0.2173 | 0.02710 | 8883 | 8.02 | <.0001 |
THE MIXED PROCEDURE | |
Data Set | WORK.SUB2PCT_LONG |
Dependent Variable | zELA |
Covariance Structures | Unstructured, Variance Components |
Subject Effects | StdPseudoId, SchCode |
Estimation Method | REML |
Residual Variance Method | Parameter |
Fixed Effects SE Method | Model-Based |
Degrees of Freedom Method | Containment |
Dimensions | |
Covariance Parameters | 5 |
Columns in X | 20 |
Columns in Z | 4969 |
Subjects | 1 |
Max Obs Per Subject | 10175 |
Number of Observations Read | 10175 |
Number of Observations Used | 8898 |
Number of Observations Not Used | 1277 |
Parameter Search | ||||||
CovP1 | CovP2 | CovP3 | CovP4 | CovP5 | Res Log Like | -2 Res Log Like |
0.6363 | -0.00940 | 0.01123 | 0.009609 | 0.2287 | -9146.8531 | 18293.7063 |
Covariance Parameter Estimates | |||||
Cov Parm | Subject | Estimate | Standard Error | Z Value | Pr Z |
UN(1,1) | StdPseudoId | 0.6363 | 0 | . | . |
UN(2,1) | StdPseudoId | -0.00940 | 0 | . | . |
UN(2,2) | StdPseudoId | 0.01123 | 0 | . | . |
UN(1,1) | SchCode | 0.009609 | 0 | . | . |
Residual | 0.2287 | 0 | . | . |
Fit Statistics | |
-2 Res Log Likelihood | 18293.7 |
AIC (smaller is better) | 18293.7 |
AICC (smaller is better) | 18293.7 |
BIC (smaller is better) | 18293.7 |
Solution for Fixed Effects | ||||||
Effect | ProFnc | Estimate | Standard Error | DF | t Value | Pr > |t| |
Intercept | 0.03238 | 15327 | 425 | 0.00 | 1.0000 | |
Timec | -0.02224 | 0.03404 | 2040 | -0.65 | 0.5136 | |
Timec*Timec | -0.00219 | 0.02929 | 4190 | -0.07 | 0.9404 | |
Timec*Timec*Timec | 0.002277 | 0.008339 | 4190 | 0.27 | 0.7848 | |
Time*Time*Time*Timec | -0.00020 | 0.000738 | 4190 | -0.27 | 0.7909 | |
Female | 0.03835 | 0.05611 | 4190 | 0.68 | 0.4943 | |
PrEdClGrd | 0.2608 | 0.1544 | 4190 | 1.69 | 0.0911 | |
PrEdSmClHS | 0.2672 | 0.06477 | 4190 | 4.13 | <.0001 | |
FRL | -0.04623 | 0.03708 | 4190 | -1.25 | 0.2125 | |
ProFnc | 0 | -0.1869 | 15327 | 4190 | -0.00 | 1.0000 |
ProFnc | 1 | 0.008199 | 15327 | 4190 | 0.00 | 1.0000 |
zCeldt | 0.2173 | 0.02710 | 4190 | 8.02 | <.0001 | |
Female*ProFnc | 0 | 0.02505 | 0.07352 | 4190 | 0.34 | 0.7333 |
Female*ProFnc | 1 | 0 | . | . | . | . |
PrEdClGrd*ProFnc | 0 | -0.1129 | 0.1988 | 4190 | -0.57 | 0.5701 |
PrEdClGrd*ProFnc | 1 | 0 | . | . | . | . |
PrEdSmClHS*ProFnc | 0 | -0.1165 | 0.08351 | 4190 | -1.39 | 0.1631 |
PrEdSmClHS*ProFnc | 1 | 0 | . | . | . | . |
zCeldt*ProFnc | 0 | 0.03423 | 0.03626 | 4190 | 0.94 | 0.3453 |
zCeldt*ProFnc | 1 | 0 | . | . | . | . |
Type I Tests of Fixed Effects | ||||
Effect | Num DF | Den DF | F Value | Pr > F |
Timec | 1 | 2040 | 0.63 | 0.4270 |
Timec*Timec | 1 | 4190 | 10.39 | 0.0013 |
Timec*Timec*Timec | 1 | 4190 | 0.01 | 0.9113 |
Time*Time*Time*Timec | 1 | 4190 | 0.01 | 0.9377 |
Female | 1 | 4190 | 5.35 | 0.0208 |
PrEdClGrd | 1 | 4190 | 3.39 | 0.0657 |
PrEdSmClHS | 1 | 4190 | 31.90 | <.0001 |
FRL | 1 | 4190 | 0.52 | 0.4690 |
ProFnc | 1 | 4190 | 33.97 | <.0001 |
zCeldt | 1 | 4190 | 171.07 | <.0001 |
Female*ProFnc | 1 | 4190 | 0.10 | 0.7482 |
PrEdClGrd*ProFnc | 1 | 4190 | 0.14 | 0.7094 |
PrEdSmClHS*ProFnc | 1 | 4190 | 1.74 | 0.1877 |
zCeldt*ProFnc | 1 | 4190 | 0.89 | 0.3453 |
Without going into a lot of other interesting things, one way of putting the two procs "on the same page" would be to use the ref= option in the CLASS statement. Perhaps something like:
CLASS stdpseudoid schcode ProFnc(ref='1');
for both procs. This would put MIXED on the same reference as HPMIXED. Once you do this, I think you will be satisfied with all of the other things you are doing in moving the covariance parameters over to MIXED. It would be nice if the STORE statement was available for HPMIXED, as then all of the adjustments you are using MIXED for would be available directly for the HPMIXED output. Good luck with this project, and let us know if this helps at all.
Steve Denham
Thanks very much, Steve.
I must be missing something. ProFnc is a numeric variable (0 or 1). I tried making 0 the ref group, but received a syntax error. Please see below:
NOTE: The SAS System stopped processing this step because of errors.
PROC HPMIXED DATA=sub2pct_long noclprint;
CLASS stdpseudoid schcode ProFnc(ref=0);
-
22
76
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_,
_CHARACTER_, _CHAR_, _NUMERIC_.
ERROR 76-322: Syntax error, statement will be ignored.
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
ProFnc
zCeldt /solution htype=1;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
PARMS /PDATA=hpmcov2pct hold=1,2,3,4,5 noiter;
RUN;
The ref= syntax requires the level to be quoted, whether it is character or numeric in nature. Try
CLASS stdpseudoid schcode ProFnc(ref='0');
making sure to quote the zero.
Steve Denham
Still no luck. I got the same error...
PROC HPMIXED DATA=sub2pct_long noclprint;
CLASS stdpseudoid schcode ProFnc(ref='0');
-
22
76
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_,
_CHARACTER_, _CHAR_, _NUMERIC_.
ERROR 76-322: Syntax error, statement will be ignored.
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
ProFnc
zCeldt /solution ;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
ODS OUTPUT covparms=hpmcov2pct;
RUN;
Two possible workarounds:
In HPMIXED (this seems to have a real finicky taste for the ref= option):
class stdpseudoid schcode ProFnc(ref=LAST);
and leave MIXED unchanged, or
In MIXED:
class stdpseudoid schcode ProFnc(ref='0');
and leave HPMIXED unchanged, or at least without the ref= option.
I noticed something else--the fixed effects are not listed in exactly the same order in the two PROCS. I am just curious as to whether that may be affecting things, as might the htype=1 hypothesis tests. Given the likely unbalanced nature and the order dependence of Type 1 hypotheses, this may also be something to be concerned about.
Steve Denham
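The point about effect order can be made concrete with a small sketch. Type I (sequential) tests depend on the order in which effects enter the MODEL statement, so one way to rule out ordering as a source of differences is to define the effect list once and reuse it in both procedures. The macro variable name `fixed` is an assumption introduced here for illustration; everything else is taken from the Model 1 code above.

```sas
/* Hypothetical macro variable: the Model 1 fixed effects, in one fixed order */
%let fixed = Timec|Timec|Timec|Timec Female PrEdClGrd PrEdSmClHS FRL ProFnc zCeldt;

/* Step 1: estimate the covariance parameters with HPMIXED */
PROC HPMIXED DATA=sub2pct_long noclprint;
  CLASS stdpseudoid schcode ProFnc;
  MODEL zELA = &fixed / solution;
  RANDOM intercept Timec / subject=stdpseudoid type=un;
  RANDOM intercept / subject=schcode type=un;
  ODS OUTPUT covparms=hpmcov2pct;
RUN;

/* Step 2: hold those estimates in MIXED and request sequential tests */
PROC MIXED DATA=sub2pct_long noclprint covtest lognote method=reml;
  CLASS stdpseudoid schcode ProFnc;
  MODEL zELA = &fixed / solution htype=1;
  RANDOM intercept Timec / subject=stdpseudoid type=un;
  RANDOM intercept / subject=schcode type=un;
  PARMS / PDATA=hpmcov2pct hold=1,2,3,4,5 noiter;
RUN;
```

With a single list, any remaining difference between the two Solution tables comes from how each procedure chooses its reference level, not from the order of entry.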
Still no luck.
First, I tried the following:
PROC HPMIXED DATA=sub2pct_long noclprint;
CLASS stdpseudoid schcode ProFnc (ref=LAST);
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
ProFnc
zCeldt /solution ;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
ODS OUTPUT covparms=hpmcov2pct;
RUN;
But received the same error message for HPMIXED:
CLASS stdpseudoid schcode ProFnc (ref=LAST);
-
22
76
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_,
_CHARACTER_, _CHAR_, _NUMERIC_.
ERROR 76-322: Syntax error, statement will be ignored.
Then, I tried the following:
PROC HPMIXED DATA=sub2pct_long noclprint;
CLASS stdpseudoid schcode ProFnc;
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
ProFnc
zCeldt /solution ;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
ODS OUTPUT covparms=hpmcov2pct;
RUN;
PROC MIXED DATA=sub2pct_long noclprint covtest lognote method=reml;
CLASS stdpseudoid schcode ProFnc (ref='0');
MODEL zELA = Timec|Timec|Timec|Timec
Female
PrEdClGrd
PrEdSmClHS
FRL
ProFnc
zCeldt /solution htype=1;
RANDOM intercept Timec /subject=stdpseudoid type=un;
RANDOM intercept /subject=schcode type=un;
PARMS /PDATA=hpmcov2pct hold=1,2,3,4,5 noiter;
RUN;
But I received the same error message for MIXED:
CLASS stdpseudoid schcode ProFnc (ref='0');
-
22
76
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_,
_CHARACTER_, _CHAR_, _NUMERIC_.
ERROR 76-322: Syntax error, statement will be ignored.
This is very curious and unexpected behavior. I have used the ref= option in the CLASS statement of both procedures and have not run into this error, other than when the value was unquoted. FIRST and LAST always seem to work. Since the other class variables are used only as subjects in the RANDOM statements, perhaps the global version would work:
CLASS stdpseudoid schcode ProFnc /ref=FIRST;
Steve Denham
Thanks, Steve. I tried the global version and received the same error message. I think the reason is I have version 9.3 TS1M0, not TS1M2 (as specified below):
"NOTE: The REF= option for setting reference levels was added to the GLM, MIXED, GLIMMIX, and ORTHOREG beginning in SAS 9.3 TS1M2."
Can I trust the random effects from HPMIXED even where the fixed effects seem weird (i.e., even when it is estimating parameters for both categories of a dichotomous CLASS variable)?
Ah, a much easier question. Because your errors are Gaussian, I believe that the random effect estimates are good. Judging from the fixed-effect estimates, the differences between levels are the same under the two procedures, so the variance estimates should not be affected. I still advise against htype=1, because of the likely imbalance and the resulting potential for order-of-entry effects. (Although somewhere, Frank Harrell is rolling his eyes in disgust at this advice.)
Steve Denham
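The claim that the two procedures agree once the reference level is accounted for can be checked by hand from the Solution tables posted above. The arithmetic below uses only the printed estimates, so agreement is limited to rounding in the output. In the printed output HPMIXED sets the ProFnc=0 row to zero and MIXED sets the ProFnc=1 row to zero.

```latex
\begin{align*}
\textbf{Model 1 intercepts} \\
\text{ProFnc}=1:\quad & -0.1662 + 0.2171 = 0.0509 \ (\text{HPMIXED}) &&\text{vs.}\quad 0.05087 \ (\text{MIXED}) \\
\text{ProFnc}=0:\quad & -0.1662 \ (\text{HPMIXED}) &&\text{vs.}\quad 0.05087 - 0.2171 = -0.1662 \ (\text{MIXED}) \\[6pt]
\textbf{Model 2 slopes for ProFnc}=0 \\
\text{Female}:\quad & 0.06340 \ (\text{HPMIXED}) &&\text{vs.}\quad 0.03835 + 0.02505 = 0.06340 \ (\text{MIXED}) \\
\text{PrEdClGrd}:\quad & 0.1479 &&\text{vs.}\quad 0.2608 - 0.1129 = 0.1479 \\
\text{PrEdSmClHS}:\quad & 0.1508 &&\text{vs.}\quad 0.2672 - 0.1165 = 0.1507 \\
\text{zCeldt}:\quad & 0.2515 &&\text{vs.}\quad 0.2173 + 0.03423 = 0.2515 \\[6pt]
\textbf{Model 2 intercepts} \\
\text{ProFnc}=0:\quad & -0.1545 \ (\text{HPMIXED}) &&\text{vs.}\quad 0.03238 - 0.1869 = -0.1545 \ (\text{MIXED}) \\
\text{ProFnc}=1:\quad & -0.1545 + 0.1951 = 0.0406 &&\text{vs.}\quad 0.03238 + 0.008199 = 0.0406
\end{align*}
```

In Model 2, HPMIXED reports a separate slope for each ProFnc level and sets the main-effect rows for Female, PrEdClGrd, PrEdSmClHS, and zCeldt to zero, while MIXED reports the ProFnc=1 slope as the main effect and the ProFnc=0 slope as a deviation from it. The standard error of 15327 on the Intercept and both ProFnc rows in the Model 2 MIXED output is consistent with those three columns being linearly dependent in that fit, so it seems safest to interpret only the sums shown above, not the individual rows.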
Thanks very much for your help. Much appreciated.
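For readers on a release that lacks the REF= option, one possible workaround is to recode the grouping variable so that the desired reference level sorts last, which is the level MIXED set to zero in the output above. This is a sketch only: the data set name `sub2pct_recode` and the variable name `NotProf` are assumptions introduced here, not names from the original post.

```sas
/* Hypothetical recode: NotProf = 1 for non-proficient, 0 for proficient */
DATA sub2pct_recode;
  SET sub2pct_long;
  /* Leave NotProf missing when ProFnc is missing */
  IF NOT MISSING(ProFnc) THEN NotProf = 1 - ProFnc;
RUN;

/* MIXED takes the last sorted level (NotProf=1, non-proficient) as reference, */
/* so the estimated row is for proficient students, as in the HPMIXED output.  */
PROC MIXED DATA=sub2pct_recode noclprint covtest lognote method=reml;
  CLASS stdpseudoid schcode NotProf;
  MODEL zELA = Timec|Timec|Timec|Timec
               Female PrEdClGrd PrEdSmClHS FRL NotProf zCeldt / solution htype=1;
  RANDOM intercept Timec / subject=stdpseudoid type=un;
  RANDOM intercept / subject=schcode type=un;
  PARMS / PDATA=hpmcov2pct hold=1,2,3,4,5 noiter;
RUN;
```

HPMIXED can be left unchanged in this approach, because recoding a two-level factor changes only which level is labeled the reference, not the fitted model.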
Hi Steve,
I have a follow-up question I was hoping you might be able to answer...
Is it possible to get MIXED to generate hypothesis tests for covariance parameters even when it's prevented from iterating on them when given values from HPMIXED? It seems to go back to solving for the random effects, which is what takes all the time and memory, and what I'm trying to avoid by using HPMIXED and MIXED.
Thank you.
What sort of tests do you have in mind--being cognizant of the distributional assumptions regarding estimates of variance, etc? Once upon a time, MIXED automatically "tested" whether the individual variance estimates were different from zero as part of the default output, using an asymptotic normal assumption. Those tests were, well, wrong. Sometimes Wald tests just do not apply.
Anyway, PROC GLIMMIX enables some specific kinds of G-side testing. But I would avoid tests if at all possible and consider instead the estimation approach, looking at confidence bounds on the parameters. Those should be available from HPMIXED. When you fix/hold/noiter the variance-covariance parameters, I don't think there will be any way to calculate bounds or tests on the parameters. The BLUPs, on the other hand, can be compared using the ESTIMATE statement.
Steve Denham
Well, I was thinking about testing whether individual variance estimates were different from zero, using covtest. But if I were to look at the confidence bounds, would I do so with 'cl' after the slash in the RANDOM statement? And how would I prevent SAS from outputting confidence intervals for every student?
RANDOM intercept Timec /subject=stdpseudoid type=un cl;
RANDOM intercept /subject=schcode type=un;
One last question if I could... Do you have any thoughts on whether it's useful/appropriate to include school-level covariates in the model even if a very small percentage (definitely less than 5%, and maybe closer to 2%) of variance is between schools? An argument in favor of including school-level covariates is that the composition of schools could still be affecting individual student performance, even if the average performance across schools doesn't vary that much.
So the code with cl in it should give only three estimates (intercept, Timec, and their covariance) with confidence bounds. Only if you specify solution, or request the BLUPs, would I expect to get a confidence bound for each student.
I always try to include design covariates, and the school-level covariates strike me as design based in the following sense: If you set up a designed experiment to look at these effects, you would include them. In this case, not including them would probably increase the standard errors and reduce the precision of your estimates.
Steve Denham
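On the question of where to put cl, a hedged sketch may help. As far as I know from the PROC MIXED documentation, CL on the PROC MIXED statement requests confidence limits for the covariance parameter estimates, whereas CL on the RANDOM statement applies to the random-effect solutions. The sketch below assumes a data set on which MIXED can estimate the covariance parameters itself, which the original poster reports is not the case for the 2 percent subsample. With HOLD= and NOITER, the standard errors are printed as 0 in the output above, so limits computed in that setting would not be informative.

```sas
/* Sketch: covariance-parameter confidence limits in MIXED (no HOLD=/NOITER) */
PROC MIXED DATA=sub2pct_long noclprint covtest cl method=reml;
  CLASS stdpseudoid schcode ProFnc;
  MODEL zELA = Timec|Timec|Timec|Timec
               Female PrEdClGrd PrEdSmClHS FRL ProFnc zCeldt / solution;
  /* No SOLUTION or CL here, so no per-student estimates are requested */
  RANDOM intercept Timec / subject=stdpseudoid type=un;
  RANDOM intercept / subject=schcode type=un;
RUN;
```

Keeping SOLUTION off the RANDOM statements is what avoids a row of output per student.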