KR123
Fluorite | Level 6

Hello,
I am trying to combine estimates for a multinomial logistic model with imputed data, but I cannot figure out how to do this when I run the model with PROC SURVEYLOGISTIC. I would like to have the estimates and the odds ratios. I've attached part of the output I get; as you can see, the problem is that I'm not getting full output from PROC MIANALYZE.

This is my non-working code:

 

proc surveylogistic data=may order=formatted;
strata sch_id;
weight bystuwt;
class educexp (ref="3") baserace (ref="4");
model educexp=baserace bmathse /link=glogit covb;
domain _imputation_;
ods output ParameterEstimates=lgsparms (where=(_imputation_ ne . ));
run;

proc mianalyze parms(classvar=classval)=lgsparms ;
class educexp baserace;
modeleffects Intercept baserace bmathse;
run;

 

Any help would be greatly appreciated! 

Accepted Solutions
SAS_Rob
SAS Employee

You should use the BY statement in SURVEYLOGISTIC and not the DOMAIN statement when you have multiply imputed data.  The DOMAIN statement would only apply when the subgroup sample sizes are random variables (i.e. not part of the sample design itself).
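
Applied to the code in the question, that change might look roughly like the sketch below. It is untested against the poster's data and only swaps DOMAIN for BY; the PROC SORT step is an addition here, on the assumption that the imputed data are not already ordered by _imputation_. The WHERE= filter on the ODS output is dropped because, with BY processing, there should be no overall row with a missing _imputation_ to exclude.

```sas
/* BY-group processing expects the input sorted by the BY variable */
proc sort data=may;
by _imputation_;
run;

/* Fit the same generalized logit model once per imputed data set */
proc surveylogistic data=may order=formatted;
by _imputation_;
strata sch_id;
weight bystuwt;
class educexp (ref="3") baserace (ref="4");
model educexp=baserace bmathse / link=glogit covb;
ods output ParameterEstimates=lgsparms;
run;
```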

 

As far as using MIANALYZE for a generalized logit model, you should be able to follow the example below.

 

/*Sample data set that assumes the imputation has already been done*/
data test;
seed=2534565;
do _imputation_=1 to 2;
do subj=1 to 60;
do a=1 to 3;
do rep=1 to 3;
ind1=ranuni(seed)*subj;
int=-1+rannor(31221)*_imputation_;
logit=int + .05*ind1-.67*a;
p=exp(-logit)/(1+exp(-logit));
if ranuni(seed)>p then y=1;
else if ranuni(314)>.5 then y=2;
else y=3;
output;
end;
end;
end;
end;
run;

ods trace on;
proc surveylogistic data=test;
class y a;
model y=ind1 a/link=glogit;
by _Imputation_;
ods output parameterestimates=parms;
run;

/*Need to sort by the different levels of the response variable so that MIANALYZE will*/
/*give the output for each logit function*/
proc sort data=parms;
by response _imputation_;
run;

proc mianalyze parms(classvar=classval)=parms;
class a;
modeleffects intercept a ind1;
by response;
run;
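
For the odds ratios asked about in the question, one possible follow-on step is to capture the pooled estimates with ODS OUTPUT and exponentiate them. This is only a sketch and not part of the original reply: the data set names mi_parms and pooled_or are made up for illustration, and it assumes the MIANALYZE ParameterEstimates table carries the columns Estimate, LCLMean and UCLMean.

```sas
/* Re-run the pooling step and save the combined estimates */
proc mianalyze parms(classvar=classval)=parms;
class a;
modeleffects intercept a ind1;
by response;
ods output ParameterEstimates=mi_parms;  /* mi_parms is a hypothetical name */
run;

/* Exponentiate the pooled log-odds estimates and their confidence limits */
data pooled_or;  /* pooled_or is a hypothetical name */
set mi_parms;
OR=exp(estimate);
LCL_OR=exp(LCLMean);
UCL_OR=exp(UCLMean);
run;

proc print data=pooled_or;
by response;
run;
```

Two caveats worth checking before relying on this. The exponentiated intercept row is an odds, not an odds ratio, so it can be ignored. Also, as far as I know SURVEYLOGISTIC uses effect coding for CLASS variables by default, in which case the exponentiated coefficients for a are odds ratios against the reference level only if the model was fit with PARAM=REF or PARAM=GLM on the CLASS statement.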

 

 

 

 

gcjfernandez
SAS Employee

To impute missing values in a survey analysis, use PROC SURVEYIMPUTE and create imputed JK (jackknife) replicate weights.

Then use these imputed JK replicate weights with PROC SURVEYLOGISTIC to fit the generalized survey logistic model.

Please refer to this paper: https://support.sas.com/resources/papers/proceedings16/SAS3520-2016.pdf

KR123
Fluorite | Level 6

Thank you for this suggestion. However, the data have already been multiply imputed following the suggestions of Berglund and Heeringa (2014). I found this paper, https://support.sas.com/resources/papers/proceedings15/3320-2015.pdf, which uses PROC SURVEYLOGISTIC and PROC MIANALYZE, but with a dichotomized dependent variable. I am trying to understand how to combine estimates from a multinomial regression.

KR123
Fluorite | Level 6

Thank you @SAS_Rob, that worked! Do you happen to know how to get the pooled odds ratios in this scenario?

ChuksManuel
Pyrite | Level 9

Hello Rob,

I saw that you responded to this post just a few months ago. I've been struggling with how to output the OR for an interaction term from the pooled multiple imputation estimates.

For example, let's assume that B is a categorical variable and Ind1 is a categorical variable with three levels. If there's an interaction between B and Ind1 and I want to find the OR of the interaction, taking the exponential of the b*Ind1 estimate from the output data set does not give me the OR of the interaction at the different levels of the exposure. Do you have an idea how to get ORs from pooled imputed estimates when the model has an interaction term?

proc mianalyze parms(classvar=classval)=parms;
class a b Ind1;
modeleffects intercept a b ind1 b*Ind1;
by response;
run;

 

SAS_Rob
SAS Employee

There are a couple of ways you could do this, but I think the easiest would be to use the LSMEANS statement and combine the differences.  You could then exponentiate those pooled differences to get the odds ratios.  Below is what I have in mind.

/*Assume that the imputation has already been done*/
data test;
seed=2534565;
do _imputation_=1 to 5;
do a=1 to 3;
do b=1 to 2;
do i=1 to 250;
x1=ranuni(21);
logit=-2 + .05*a+.45*b+.88*a*b;
p=exp(-logit)/(1+exp(-logit));
if ranuni(seed)>p then y=1; else y=0;
output;
end;
end;
end;
end;
run;
proc surveylogistic data=test;
by _imputation_;
class a/param=glm;
model y=a|x1;
lsmeans a/at x1=.2 diff;
lsmeans a/at x1=.4 diff;
ods output diffs=diff_ds;
run;
data diff2;
set diff_ds;
comparison=effect||'='||trim(left(a))||' vs '||left(effect)||'='||left(_a);
run;

proc sort data=diff2;
by comparison _imputation_;
run;


proc mianalyze data=diff2;
by comparison;
modeleffects estimate;
stderr stderr;
ods output parameterestimates=mianalyze_parms;
run;

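/*Exponentiate the pooled differences and their confidence limits to get odds ratios*/
data OR;
set mianalyze_parms;
OR=exp(estimate);
LCL_OR=exp(LCLMean);
UCL_OR=exp(UCLMean);
run;

proc print data=OR;
var comparison OR LCL_OR UCL_OR;
run;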
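
For readers wondering what the MIANALYZE step is doing with the estimate and standard error pairs, the standard combining rules (Rubin's rules) can be written out as below. This is a general statement of the method rather than something taken from the reply above. Here m is the number of imputations, \hat{Q}_i is the LSMEANS difference from imputation i, and SE_i is its standard error; the notation is chosen for this note.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m}\mathrm{SE}_i^{2}, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2},\\
T &= \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B, \qquad
\nu = (m-1)\Bigl[1+\frac{\bar{U}}{(1+1/m)\,B}\Bigr]^{2},\\
\mathrm{OR} &= \exp(\bar{Q}), \qquad
\mathrm{CI}_{\mathrm{OR}} = \exp\!\bigl(\bar{Q} \pm t_{\nu,\,1-\alpha/2}\sqrt{T}\bigr).
\end{aligned}
```

The pooling happens on the log-odds scale, which is why the exponentiation is applied to the combined estimate and its confidence limits rather than averaging the odds ratios from each imputation.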

 

ChuksManuel
Pyrite | Level 9

Hello,

Thanks. This part of the code is a little confusing:

proc surveylogistic data=test;
by _imputation_;
class a/param=glm;
model y=a|x1;
lsmeans a/at x1=.2 diff;
lsmeans a/at x1=.4 diff;

 

I assume you got x1=0.2 from the intercept and x1=0.4 from the coefficient of a-b.

Since my predictors are all categorical, why can't I do something like this?

lsmeans a/at x1=0 diff; (to denote the first level of my categorical variables)

lsmeans a/at x1=1 diff; (to denote the second level of my categorical variables)

 

ChuksManuel
Pyrite | Level 9

Hello Rob, I output the interaction effects from the 50 imputations to a data set.

How do I get the pooled estimate of these estimates?

Here's a snippet of my code. I need help, or is there a contact at SAS who can help? I can't find one on the sas.com website. The SAS documentation on this topic does not explain how to use PROC MIANALYZE to pool estimates of an interaction term.

proc surveylogistic data =newnsch3c NAMELEN=100;
BY _Imputation_;
class    smokes(Ref= '0')   composite (ref='0') /param=glm ;
strata Fipsst;
cluster hhid;
weight fwc;
Model asthma (event ='1') =  smokes composite smokes*composite ;
lsmeans smokes*composite/ diff;
ods output diffs=diff_ds;
run;

smokes has 2 levels (0 and 1), while composite has 4 (0, 1, 2, 3). The above code gave me a data set with a structure like this. How do I pool these effects together using PROC MIANALYZE?

 
 

[Image: Inter.JPG]

SAS_Rob
SAS Employee

You have to set up the comparison variable the same way as in the case of an interaction with a continuous variable except you will need to cover both variables.  Here is a simple example with random data.

 

data outmi;
do _imputation_=1 to 3;
do trt='test','trt1','trt2';
do b=1 to 3;
do rep=1 to 110;
trtn+1;
if ranuni(4123)>.5 then y=1; else y=0;
output;
end;
end;
end;
end;
run;

/*Assuming the imputation has already been done*/
proc surveylogistic data=outmi;
by _imputation_;
class trt b/param=glm;
model y=trt|b;
lsmeans trt|b/diff;
ods output diffs=lsmeans_diffs;
run;
data diff2;
set lsmeans_diffs;
comparison=trt||''||trim(left(b))||' vs '||left(_trt)||''||left(_b);
run;
proc sort data=diff2;
by comparison _imputation_;
run;

proc mianalyze data=diff2;
by comparison;
modeleffects estimate;
stderr stderr;
ods output ParameterEstimates=MI_parms;
run;
data OR;
set MI_parms;
OR=exp(estimate);
lower_or=exp(LCLMean);
upper_or=exp(UCLMean);
keep comparison OR lower_or upper_or;
run;

proc print;
run;

 

ChuksManuel
Pyrite | Level 9
Thank you so much. This worked and was shorter.
I had solved the problem by creating a data set containing the smokes 1 vs 0 comparisons, and then splitting it into 4 data sets, each holding the parameter estimates for one level of my composite variable. I then ran PROC MIANALYZE on each of the 4 data sets, and each one gave me the OR. Longer, but it did the job.
Thanks so much!
