Re: Zero estimates for solution for random effects in proc mixed

MOA · Posted 08-22-2017 01:01 PM

I'm conducting an MLM analysis using proc mixed. The level 1 variables are dummy-coded variables for grade comparisons (Dummy_FS = freshmen vs. seniors; Dummy_SS = sophomores vs. seniors; Dummy_JS = juniors vs. seniors). Note that some institutions do not have freshmen, others do not have sophomores, and others do not have juniors.

I requested the solutions for random effects, and for some institutions I got zero estimates for some of the dummy variables. Those zero estimates correspond to institutions that have no data available to compute those estimates. For example, looking at the attached table, Institution 372 does not have juniors or sophomores, hence the zero estimates in Dummy_FS and Dummy_SS.

My question is: what does the zero estimate mean? Specifically, I'd like to know whether institutions with no information for a specific comparison are still being considered in the overall estimation of the random slopes, or whether they are excluded from it. How should I interpret the zero estimates and the standard error of prediction for those institutions?

sld · Posted 08-22-2017 01:18 PM

It would be helpful to see your code and your dataset (or if your data cannot be shared, a dataset that closely resembles yours and provides similar results).

Why are you concocting your own dummy variables for the grade factor rather than using the CLASS statement?

PaigeMiller · Posted 08-22-2017 02:12 PM

@MOA wrote:

Specifically, I'd like to know if those institutions with no information in specific comparisons, are being considered in the overall estimation of the random slopes or if they are not being considered in the estimation.

Random slopes of what? You haven't mentioned the model you are fitting or any other variables that are being used.

As the other respondent said, we really need to see your code.

--
Paige Miller

MOA · Posted 08-22-2017 02:25 PM

I'm fitting an MLM with the three dummy variables as level 1 variables. I'm estimating the fixed effects and random slopes for the dummy variables. Here's the code:

proc mixed data=MLM_grade method=ml covtest;
    class inst_ID;
    model dv = Dummy_FS Dummy_SS Dummy_JS / solution;
    random intercept Dummy_FS Dummy_SS Dummy_JS / subject=inst_ID s;
run;

I used dummy variables instead of the class statement since I am comparing the results across different programs (Mplus, SPSS, HLM). Attached is the data.

sld · Posted 08-22-2017 04:28 PM

I'll assume it is your intent to compare the mean DV across four grades, or to compare the first 3 means (freshman, sophomore, junior) to the fourth (senior).

Grade is categorical. Although it is true that ANOVA is equivalent to regression on a set of dummy variables, I don't think random slopes for dummy variables make any sense.

If you want to compare estimates of mean DV by grade across software packages, it does not matter how grade is coded; any coding system will give you the same means. If you want to compare parameter estimates, then you will need to use similar coding systems. By default, the MIXED procedure uses the last level as the reference level, but you can control that using the REF= option on the CLASS statement.
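As a minimal sketch of the REF= option, assuming a hypothetical variable grade whose formatted values are '1' through '4' with '4' meaning senior (adjust the level to match your actual coding):

```sas
/* Sketch only: assumes grade is coded 1-4 with 4 = senior.          */
/* REF='4' makes senior the reference level explicitly, so the fixed */
/* effect estimates line up with Dummy_FS, Dummy_SS and Dummy_JS.    */
proc mixed data=MLM_grade method=ml covtest;
    class inst_ID grade(ref='4');
    model dv = grade / solution;
run;
```

With the reference level set this way, the three non-zero grade estimates in the solution table should correspond to the freshman, sophomore, and junior contrasts against seniors.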

Assuming that multiple observations (students?) for each level of grade within the same inst_ID are subsamples and are not independent, I suggest this code:

proc mixed data=MLM_grade method=ml covtest;
    class inst_ID grade;
    model dv = grade / solution;
    random inst_id inst_id*grade / solution;
    /* pair-wise comparisons among grades */
    lsmeans grade / pdiff adjust=simulate(seed=123);
    /* comparison of each grade to control */
    lsmeans grade / adjust=dunnett pdiff=control('4');
run;

An equivalent syntax for the RANDOM statement is

    random intercept grade / subject=inst_id solution;

Either RANDOM statement produces an estimate of variance among inst_id, an estimate of variance among groups of students assigned to the same grade within inst_id, and an estimate of variance among students within groups (residual).
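Written out, the model behind those three variance components could be sketched as follows. The symbols are my own notation, not SAS output: i indexes institutions, j grades, and k students within a grade within an institution.

```latex
y_{ijk} = \mu + \alpha_j + u_i + v_{ij} + e_{ijk},
\qquad
u_i \sim N(0, \sigma^2_{\text{inst}}),\quad
v_{ij} \sim N(0, \sigma^2_{\text{inst} \times \text{grade}}),\quad
e_{ijk} \sim N(0, \sigma^2_{e})
```

Here \alpha_j is the fixed grade effect, u_i is the institution effect, v_{ij} is the effect of grade j within institution i, and e_{ijk} is the student-level residual.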

This model assumes that data are missing at random, but in fact you are missing appreciably more data on freshman and sophomore grade levels than on junior and senior. The number of students is extremely unbalanced within each inst_id and within groups within inst_id. These issues may make this model inappropriate.
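One way to see how unbalanced the design is would be to tabulate student counts by institution and grade. This is only a sketch and assumes the dataset contains a categorical grade variable, as in the code suggested earlier in this post:

```sas
/* Sketch only: grade is assumed to exist in MLM_grade.              */
/* Counts of students per institution and grade; empty cells show    */
/* which institution-by-grade combinations have no data at all.      */
proc freq data=MLM_grade;
    tables inst_ID*grade / norow nocol nopercent;
run;
```

Cells with a count of zero identify the institutions that contribute no information for a given grade.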
