Re: how to run multiple pairwise testing for categorical variables in proc surveylogistic

mj3 · Posted 06-10-2017 12:27 PM

Hi,

I am trying to run multiple comparison tests to see whether the distribution of a categorical variable, education (4 levels), differs significantly across 5 groups. This is weighted survey data, so I am fitting a multinomial logistic regression with education level as the dependent variable and the 5-level group variable as the independent variable, then using LSMEANS with the Tukey adjustment to check for significant pairwise differences.

My syntax is below:

PROC SURVEYLOGISTIC DATA=data1;
weight weight;
CLASS EDUCATION_LEVEL GROUP / PARAM=GLM;
MODEL EDUCATION_LEVEL = GROUP / LINK=GLOGIT;
LSMEANS GROUP / ADJUST=TUKEY PDIFF=ALL;
RUN;

However, the differences table in my output doesn't show the differences between groups for the reference level of education. How do I change the code to show whether the reference level of education varies across the 5 groups?
Should I be using LSESTIMATE and, if so, how do I specify it? I'm using SAS version 9.4.

Thanks in advance!

StatDave · Posted 07-13-2017 10:05 AM

For a four-level response, only three independent response functions (generalized logits) can be modeled simultaneously. Each is the log of the ratio of the probability of one response level to the probability of the reference level. As a result, you get LS-mean estimates and pairwise comparisons among the predictor levels for each of those three response functions, but none in which the reference level is the level of interest. The easiest way to get the estimates and comparisons for a logit focused on the reference response level is to change the reference level of the response. To do that, add the REF= option in parentheses after the response variable name in the MODEL statement and specify, in quotes, a value of the response other than the reference level used in your initial analysis.
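To make the point about the reference level concrete, here is a sketch of the three response functions in symbols. The notation is assumed for illustration only: Y is the four-level response with its levels numbered 1 to 4, level 4 is taken as the reference, and G is the group variable.

```latex
\begin{aligned}
\eta_j(g) &= \log\frac{\Pr(Y = j \mid G = g)}{\Pr(Y = 4 \mid G = g)}, \qquad j = 1, 2, 3,\\[4pt]
\Pr(Y = j \mid G = g) &= \frac{\exp\{\eta_j(g)\}}{1 + \sum_{k=1}^{3} \exp\{\eta_k(g)\}}, \qquad
\Pr(Y = 4 \mid G = g) = \frac{1}{1 + \sum_{k=1}^{3} \exp\{\eta_k(g)\}}.
\end{aligned}
```

No logit has the reference level in its numerator, so no set of LS-mean differences is reported for it. Choosing a different reference level r yields a logit of the form log[Pr(Y = 4 | G = g) / Pr(Y = r | G = g)], which is the one focused on the original reference level.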

For example, the following uses the data in the example titled "Nominal Response Data: Generalized Logits Model" in the LOGISTIC documentation. The response, Style, has three levels: self, team, and class. The first step below fits the model to the logits for styles self and team, using class as the reference in both logits. The probability estimates for each school on the self and team styles are produced by the ILINK option in the LSMEANS statement and appear in the Mean column of the Least Squares Means table. The estimates for the class style are provided by the second step, which changes the reference level of the response to self instead of class. This code uses PROC LOGISTIC, but the same should work with SURVEYLOGISTIC.

proc logistic data=school;
   freq Count;
   class School / param=glm;
   model Style(order=data) = School / link=glogit;
   lsmeans School / diff ilink;
run;

proc logistic data=school;
   freq Count;
   class School / param=glm;
   model Style(order=data ref="self") = School / link=glogit;
   lsmeans School / diff ilink;
run;
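Applied to the model in the original question, the second step might look roughly like the sketch below. It reuses the data set and variable names from the question (data1, weight, EDUCATION_LEVEL, GROUP) and keeps the original LSMEANS options. The REF= value "College" is purely hypothetical: replace it with an actual value of EDUCATION_LEVEL that is not the reference level in the first run. The response is left out of the CLASS statement here, on the assumption that its levels are taken from the MODEL statement.

```sas
/* Sketch only: refit the weighted generalized logit model with a      */
/* different response reference level, so that the original reference  */
/* level of EDUCATION_LEVEL appears in the numerator of one logit.     */
proc surveylogistic data=data1;
   weight weight;                      /* survey weight variable from the question */
   class GROUP / param=glm;            /* GLM coding is needed for LSMEANS */
   /* "College" is a hypothetical placeholder, not a value known from the data */
   model EDUCATION_LEVEL(ref="College") = GROUP / link=glogit;
   lsmeans GROUP / adjust=tukey pdiff=all;   /* same options as the original run */
run;
```

The differences table from this run should then include a set of comparisons for the education level that served as the reference in the first run, alongside the two non-reference levels that remain.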
