Solved: Re: PROC GLIMMIX: Different p-values for different reference groups?

sasrules · Posted 04-12-2016 12:14 AM

Hello All,

I'm running a multinomial logistic model with a random effect using PROC GLIMMIX. I've noticed that I get different p-values for my independent variable depending on which reference group I use. This seems strange to me and doesn't happen if I remove the random effect. I am wondering if I might be doing something incorrectly with the random effect. In particular, I don't understand why I have to add the "group" option for the random effect, but SAS seems to require that I do so! Please see my code below. Thank you!

proc glimmix data=lib.submixed;
class author placement;
model placement(ref='Middle')=year/dist=mult link=glogit; /*p=0.0026 for year*/
random int / subject=author group=placement;
run;

proc glimmix data=lib.submixed;
class author placement;
model placement(ref='End')=year/dist=mult link=glogit; /*p=0.0089 for year*/
random int / subject=author group=placement;
run;

lvm · Posted 04-16-2016 04:33 PM

Your results actually make sense, but may not be intuitive. With generalized logits and K levels of the response variable, one is actually fitting K-1 linear predictors (one for each non-reference level of Y). So the models being fitted depend on the one not being fitted (the reference level, which is essentially assigned a linear predictor of 0). The overall relationship to X for levels 1 and 2 (for instance) can be different from the relationship to X for levels 2 and 3, and so on. See the excerpt below from the User's Guide, especially the part I put in bold. Likewise, the random effects can be different for each of the K-1 linear predictors; this is why one is forced to have separate random effects (the group= option).
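To make the K-1 linear predictors concrete, here is a rough sketch of the model for the code in the question. The notation is assumed for illustration and does not come from the thread or the User's Guide: i indexes observations, j indexes authors, r is the reference level, and alpha, beta and u are hypothetical symbols for the intercepts, the year slopes and the random intercepts.

```latex
% Generalized logits with reference level r (notation assumed for illustration)
\begin{align*}
\eta_{ijk}^{(r)} &= \log\frac{\pi_{ijk}}{\pi_{ijr}}
  = \alpha_k^{(r)} + \beta_k^{(r)}\,\mathrm{year}_{ij} + u_{jk}^{(r)},
  \qquad k \neq r, \\
\eta_{ijr}^{(r)} &= 0, \qquad
u_{jk}^{(r)} \sim N\!\left(0, \sigma_k^2\right) \text{ independently across } k .
\end{align*}
% The same probabilities expressed against a different reference level s
\begin{align*}
\log\frac{\pi_{ijk}}{\pi_{ijs}}
  &= \eta_{ijk}^{(r)} - \eta_{ijs}^{(r)} \\
  &= \left(\alpha_k^{(r)} - \alpha_s^{(r)}\right)
   + \left(\beta_k^{(r)} - \beta_s^{(r)}\right)\mathrm{year}_{ij}
   + \left(u_{jk}^{(r)} - u_{js}^{(r)}\right).
\end{align*}
```

Without the random intercepts, the second display is only a reparameterization of the first, which would be consistent with the p-value for year not changing when the random effect is removed. With them, the differences of random intercepts all share the term for level s and so are correlated across k. As far as I understand the group= option, the group-specific intercepts are treated as independent, so that correlation cannot be reproduced, and the two reference choices amount to different models.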

In generalized logit models (for multinomial data with unordered categories), one response category is chosen as the reference category in the formulation of the generalized logits. By default, the linear predictor in the reference category is set to 0, and the reference category corresponds to the entry in the "Response Profile" table with the highest Ordered Value. You can affect the assignment of Ordered Values with the DESCENDING and ORDER= options in the MODEL statement. You can choose a different reference category with the REF= option. The choice of the reference category for generalized logit models affects the results. It is sometimes recommended that you choose the category with the highest frequency as the reference (see, for example, Brown and Prescott 1999, p. 160). You can achieve this with the GLIMMIX procedure by combining the ORDER= and REF= options, as in the following statements:

proc glimmix;
   class preference;
   model preference(order=freq ref=first) = feature price /
                   dist=multinomial
                   link=glogit;
   random intercept / subject=store group=preference;
run;

The ORDER=FREQ option arranges the categories by descending frequency. The REF=FIRST option then selects the response category with the lowest Ordered Value—the most frequent category—as the reference.
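Applied to the model in the question, the recommendation might look like the sketch below. The data set and variable names are taken from the original post. The SOLUTION option is my addition, included only so the fixed-effect estimates for each non-reference level are printed. I have not run this against the poster's data.

```sas
/* Sketch: use the most frequent placement category as the reference */
proc glimmix data=lib.submixed;
   class author placement;
   /* order=freq sorts levels by descending frequency;            */
   /* ref=first then makes the most frequent level the reference  */
   model placement(order=freq ref=first) = year /
                   dist=multinomial
                   link=glogit
                   solution;   /* added here to print the estimates */
   random intercept / subject=author group=placement;
run;
```

The "Response Profile" table in the output should show which category ended up as the reference.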

sasrules · Posted 04-17-2016 01:40 PM

Thank you!