Hello,
I am trying to combine estimates for a multinomial logistic model with imputed data. I cannot figure out how to do this when I run the model with proc surveylogistic. I would like to have the estimates and the odds ratios. I've attached part of the output I get, and you can see the problem is that I'm not getting full output from mianalyze.
This is my non-working code:
proc surveylogistic data=may order=formatted;
strata sch_id;
weight bystuwt;
class educexp (ref="3") baserace (ref="4");
model educexp=baserace bmathse /link=glogit covb;
domain _imputation_;
ods output ParameterEstimates=lgsparms (where=(_imputation_ ne . ));
run;
proc mianalyze parms(classvar=classval)=lgsparms ;
class educexp baserace;
modeleffects Intercept baserace bmathse;
run;
Any help would be greatly appreciated!
You should use the BY statement in SURVEYLOGISTIC and not the DOMAIN statement when you have multiply imputed data. The DOMAIN statement would only apply when the subgroup sample sizes are random variables (i.e. not part of the sample design itself).
As far as using MIANALYZE for a generalized logit model, you should be able to follow the example below.
/*Sample data set that assumes the imputation has already been done*/
data test;
seed=2534565;
do _imputation_=1 to 2;
do subj=1 to 60;
do a=1 to 3;
do rep=1 to 3;
ind1=ranuni(seed)*subj;
int=-1+rannor(31221)*_imputation_;
logit=int + .05*ind1-.67*a;
p=exp(-logit)/(1+exp(-logit));
if ranuni(seed)>p then y=1;
else if ranuni(314)>.5 then y=2;
else y=3;
output;
end;
end;
end;
end;
run;
ods trace on;
proc surveylogistic data=test;
class y a;
model y=ind1 a/link=glogit;
by _Imputation_;
ods output parameterestimates=parms;
run;
/*Need to sort by the different levels of the response variable so that MIANALYZE will*/
/*give the output for each logit function*/
proc sort data=parms;
by response _imputation_;
run;
proc mianalyze parms(classvar=classval)=parms;
class a;
modeleffects intercept a ind1;
by response;
run;
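Applied to the model in the original question, the same pattern might look like the sketch below. It is untested against the poster's data. The variable names, reference levels, and data set names are copied from the question; moving the response reference level to the MODEL statement and adding param=ref are assumptions made here so that each pooled coefficient is a contrast against the stated reference level.

```sas
/*BY processing needs the input sorted by imputation number*/
proc sort data=may;
by _imputation_;
run;
proc surveylogistic data=may order=formatted;
by _imputation_; /*replaces the DOMAIN statement*/
strata sch_id;
weight bystuwt;
class baserace (ref="4")/param=ref; /*param=ref is an assumption, not in the question*/
model educexp(ref="3")=baserace bmathse/link=glogit covb;
ods output ParameterEstimates=lgsparms;
run;
/*One block of estimates per logit function, ordered by imputation*/
proc sort data=lgsparms;
by response _imputation_;
run;
proc mianalyze parms(classvar=classval)=lgsparms;
class baserace;
modeleffects Intercept baserace bmathse;
by response;
run;
```

With BY processing every row of ParameterEstimates belongs to an imputation, so the where=(_imputation_ ne .) filter from the question is no longer needed.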
To impute missing values in Survey analysis, use PROC SURVEYIMPUTE and create imputed JK weights.
Then use these imputed JK replicate weights with PROC Surveylogistic to fit the generalized survey logistic model.
Please refer to this paper: https://support.sas.com/resources/papers/proceedings16/SAS3520-2016.pdf
Thank you for this suggestion. However, the data have already been multiply imputed following the suggestions of Berglund and Heeringa (2014). I found this paper, https://support.sas.com/resources/papers/proceedings15/3320-2015.pdf, which uses proc surveylogistic and proc mianalyze, but with a dichotomized dependent variable. I am trying to understand how to combine estimates from a multinomial regression.
Thank you @SAS_Rob , that worked! Do you happen to know how to get pooled odds ratio in this scenario?
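One possible way to get pooled odds ratios here, sketched under assumptions rather than taken from the thread's accepted answer, is to capture the pooled estimates from MIANALYZE and exponentiate them along with their confidence limits. The data set names mi_parms and pooled_or are made up for illustration. Note that SURVEYLOGISTIC uses effect coding for CLASS variables by default, so the exponentiated coefficients for a are odds ratios against a reference level only if the model was fit with reference coding, for example class a/param=ref. The coefficient for the continuous variable ind1 exponentiates to an odds ratio per unit increase under either coding.

```sas
/*Assumes parms is already sorted by response _imputation_ as in the example above*/
proc mianalyze parms(classvar=classval)=parms;
class a;
modeleffects intercept a ind1;
by response;
ods output ParameterEstimates=mi_parms; /*hypothetical data set name*/
run;
data pooled_or;
set mi_parms;
/*The exponentiated intercept is an odds, not an odds ratio, so drop it*/
if upcase(Parm) ne 'INTERCEPT';
OR=exp(Estimate);
LCL_OR=exp(LCLMean);
UCL_OR=exp(UCLMean);
run;
proc print data=pooled_or;
run;
```

Pooling is done on the log-odds scale and the result is exponentiated afterwards. Averaging the odds ratios themselves across imputations would give a different answer.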
Hello Rob,
I saw you responded to this post just a few months ago. I've been struggling with how to output the OR for an interaction term from pooled multiple imputation estimates.
For example, let's assume that B is a categorical variable and Ind1 is a categorical variable with three levels. If there's an interaction between B and Ind1 and I want the OR for the interaction, exponentiating the b*Ind1 estimate in the data set does not give me the OR at each level of the exposure. Do you have an idea how to get ORs from pooled imputed estimates when the model has an interaction term?
proc mianalyze parms(classvar=classval)=parms;
class a b Ind1;
modeleffects intercept a b ind1 b*Ind1;
by response;
run;
There are a couple of ways you could do this, but I think the easiest would be to use the LSMEANS statement and combine the differences. The differences are on the log-odds scale, so you could then exponentiate the pooled differences to get the odds ratios. Below is what I have in mind.
/*Assume that the imputation has already been done*/
data test;
seed=2534565;
do _imputation_=1 to 5;
do a=1 to 3;
do b=1 to 2;
do i=1 to 250;
x1=ranuni(21);
logit=-2 + .05*a+.45*b+.88*a*b;
p=exp(-logit)/(1+exp(-logit));
if ranuni(seed)>p then y=1; else y=0;
output;
end;
end;
end;
end;
run;
proc surveylogistic data=test;
by _imputation_;
class a/param=glm;
model y=a|x1;
lsmeans a/at x1=.2 diff;
lsmeans a/at x1=.4 diff;
ods output diffs=diff_ds;
run;
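data diff2;
set diff_ds;
/*Include the AT value so the x1=.2 and x1=.4 differences are pooled separately.*/
/*This assumes the Diffs table stores the AT value in a column named x1; check with PROC CONTENTS.*/
comparison=effect||'='||trim(left(a))||' vs '||left(effect)||'='||trim(left(_a))||' at x1='||left(x1);
run;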
proc sort data=diff2;
by comparison _imputation_;
run;
proc mianalyze data=diff2;
by comparison;
modeleffects estimate;
stderr stderr;
ods output parameterestimates=mianalyze_parms;
run;
data OR;
set mianalyze_parms;
OR=exp(estimate);
LCL_OR=exp(LCLMean);
UCL_OR=exp(UCLMean);
run;
proc print data=OR;
var comparison OR LCL_OR UCL_OR;
run;
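To see what MIANALYZE is doing with these differences, the pooled estimate and standard error can be reproduced by hand with Rubin's rules. The following is a rough sketch, assuming diff2 holds exactly one row per comparison per imputation; the data set name rubin_check and its column names are made up for illustration. It does not reproduce the degrees of freedom or confidence limits that MIANALYZE reports.

```sas
proc sql;
create table rubin_check as
select comparison,
count(*) as m, /*number of imputations*/
mean(estimate) as qbar, /*pooled log odds ratio*/
mean(stderr**2) as w, /*within-imputation variance*/
var(estimate) as b, /*between-imputation variance*/
calculated w + (1 + 1/calculated m)*calculated b as t, /*total variance*/
sqrt(calculated t) as pooled_se,
exp(calculated qbar) as odds_ratio
from diff2
group by comparison;
quit;
proc print data=rubin_check;
run;
```

If m comes out larger than the number of imputations, the comparison variable is not unique within an imputation, and rows that should be pooled separately are being mixed.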
Hello,
Thanks. This part of the code is a little confusing:
proc surveylogistic data=test;
by _imputation_;
class a/param=glm;
model y=a|x1;
lsmeans a/at x1=.2 diff;
lsmeans a/at x1=.4 diff;
I assume you got x1=0.2 from the intercept and x1=0.4 from the coefficient of a-b.
Since my predictors are all categorical, why can't I do something like this:
lsmeans a/at x1=0 diff; to denote the first level of my cat variables
lsmeans a/at x1=1 diff; to denote the second level of my cat variables.
Hello Rob, I output the interaction effects for the 50 imputations to a data set.
How do I get the pooled estimate from these estimates?
Here's a snippet of my code. I need help; is there a contact at SAS who can help? I can't find one on the sas.com website. The SAS documentation on this topic does not explain how to use proc mianalyze to pool estimates of an interaction term.
proc surveylogistic data =newnsch3c NAMELEN=100;
BY _Imputation_;
class smokes(Ref= '0') composite (ref='0') /param=glm ;
strata Fipsst;
cluster hhid;
weight fwc;
Model asthma (event ='1') = smokes composite smokes*composite ;
lsmeans smokes*composite/ diff;
ods output diffs=diff_ds;
run;
smokes has 2 levels (0 and 1), while composite has 4 (0, 1, 2, 3). The above code gave me a data set with a structure like this. How do I pool these effects together using proc mianalyze?
You have to set up the comparison variable the same way as in the case of an interaction with a continuous variable except you will need to cover both variables. Here is a simple example with random data.
data outmi;
do _imputation_=1 to 3;
do trt='test','trt1','trt2';
do b=1 to 3;
do rep=1 to 110;
trtn+1;
if ranuni(4123)>.5 then y=1; else y=0;
output;
end;
end;
end;
end;
run;
/*Assuming the imputation has already been done*/
proc surveylogistic data=outmi;
by _imputation_;
class trt b/param=glm;
model y=trt|b;
lsmeans trt|b/diff;
ods output diffs=lsmeans_diffs;
run;
data diff2;
set lsmeans_diffs;
comparison=trt||''||trim(left(b))||' vs '||left(_trt)||''||left(_b);
run;
proc sort data=diff2;
by comparison _imputation_;
run;
proc mianalyze data=diff2;
by comparison;
modeleffects estimate;
stderr stderr;
ods output ParameterEstimates=MI_parms;
run;
data OR;
set MI_parms;
OR=exp(estimate);
lower_or=exp(LCLMean);
upper_or=exp(UCLMean);
keep comparison OR lower_or upper_or;
run;
proc print;
run;
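The Diffs table from lsmeans trt|b/diff holds every pairwise difference, including main-effect differences and cells that differ on both factors. If the goal is the odds ratio for one factor at each level of the other, one option is to keep only those rows before pooling. The following is a minimal sketch, assuming the interaction rows are labeled trt*b in the Effect column; the data set names simple_diffs, MI_simple, and OR_simple are made up for illustration.

```sas
/*Keep only trt differences taken at a common level of b*/
data simple_diffs;
set lsmeans_diffs;
where upcase(effect)='TRT*B' and b=_b and trt ne _trt;
length comparison $60;
comparison=catx(' ',trt,'vs',_trt,'at b =',b);
run;
proc sort data=simple_diffs;
by comparison _imputation_;
run;
proc mianalyze data=simple_diffs;
by comparison;
modeleffects estimate;
stderr stderr;
ods output ParameterEstimates=MI_simple;
run;
data OR_simple;
set MI_simple;
OR=exp(estimate);
lower_or=exp(LCLMean);
upper_or=exp(UCLMean);
keep comparison OR lower_or upper_or;
run;
proc print data=OR_simple;
run;
```

For the smokes by composite model asked about earlier, the same filter would presumably use smokes and composite in place of trt and b, giving one smokes odds ratio per level of composite.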