ChuksManuel
Pyrite | Level 9

Hello statisticians,

I have been using PROC MIANALYZE for some time, but I am now unable to run my code because the procedure gives me errors.

It's basically telling me that an interaction term that I put in my MODELEFFECTS statement is not in the dataset, when it actually is.

 

ChuksManuel_0-1594632154990.png

This is a snip of the Lgsparm dataset and you can see the interaction term is there. Can anyone tell me why this is giving me this error? I have used this code many times and this is the first time this is happening.

ChuksManuel_1-1594632368610.png

 

 

1 ACCEPTED SOLUTION

SAS_Rob
SAS Employee

Because of the length of the variable names, the naming of the interaction terms is truncated to 20 characters.  You'll notice that SmokeInside is missing the e in the naming of the interaction in the PARMS= data set.  The solution to the problem is to use the NAMELEN=100 option in the modeling procedure that creates the parameter estimates table, that is, the procedure you ran before Proc MIANALYZE.


ChuksManuel
Pyrite | Level 9

Thank you very much. That worked.

There is also a question about how to output odds ratios for interaction terms that I asked here two weeks ago and never got a response to. Here is my code:

proc surveylogistic data =newnsch3c NAMELEN=100;
class    sex (ref = '0') race(Ref = '0') FPL (ref= '0')   smokeinside(Ref= '0') 
family (ref='0')  composite (ref='0') age3_1718 (ref='2')  /param=glm ;
strata Fipsst;
cluster hhid;
weight fwc;
Model asthma (event ='1') = age3_1718 sex race FPL smokeinside composite family smokeinside*composite ;
lsmeans smokeinside*composite/oddsratio cl diff;
slice smokeinside*composite/ sliceby=composite diff=control('0' '0') oddsratio cl; 
ODS OUTPUT PARAMETERESTIMATES=lgsparms ODDSRATIO=lgsodds;
BY _Imputation_;
run;


PROC MIANALYZE PARMS(CLASSVAR=CLASSVAL)=lgsparms;
 CLASS age3_1718 sex race FPL smokeinside composite family ;
 MODELEFFECTS age3_1718 sex race FPL smokeinside composite family smokeinside*composite;
 ODS OUTPUT PARAMETERESTIMATES=mian_;
RUN;

- In the first step, I used the odds ratio options, but the procedure did not output the odds ratios. How can I output the ORs?

- The second step pools the parameter estimates from all 20 imputations (mian_). Exponentiating those pooled parameters gives the ORs for variables not involved in the interaction. For the interaction, however, smokeinside has 3 levels and composite has 4 levels, so simple exponentiation will not give the OR of smokeinside at each level of composite. How do I get the ORs for this interaction from the pooled parameters?
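
A sketch of the arithmetic behind this second question, under stated assumptions: with PARAM=GLM and REF='0' on both variables, the parameters for the reference levels are fixed at zero, so the log odds ratio for smokeinside level s versus level 0 at composite level c is the smokeinside main effect plus the matching interaction parameter. The symbols \beta^{S}_{s} and \gamma_{s,c} are notation introduced here for illustration, not names from the SAS output.

```latex
\begin{aligned}
\log \mathrm{OR}(s \text{ vs } 0 \mid \text{composite}=c) &= \beta^{S}_{s} + \gamma_{s,c}, \qquad \gamma_{s,0} = 0,\\
\mathrm{OR}(s \text{ vs } 0 \mid \text{composite}=c) &= \exp\!\left(\beta^{S}_{s} + \gamma_{s,c}\right),\\
\operatorname{Var}\!\left(\hat\beta^{S}_{s} + \hat\gamma_{s,c}\right) &= \operatorname{Var}\!\left(\hat\beta^{S}_{s}\right) + \operatorname{Var}\!\left(\hat\gamma_{s,c}\right) + 2\operatorname{Cov}\!\left(\hat\beta^{S}_{s}, \hat\gamma_{s,c}\right).
\end{aligned}
```

At the reference level of composite the interaction term drops out, so the OR there is just the exponentiated main effect. Because of the covariance term, a confidence interval for the combined estimate needs the pooled covariance between the two parameters, not only their pooled standard errors.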

 

 

 

ChuksManuel
Pyrite | Level 9

Another question:

Do you also know why I'm getting this message?

ChuksManuel_0-1594661731525.png

 

SteveDenham
Jade | Level 19

It sure looks like the sorting is critical.  I would suggest adding a sort like:

 

proc sort data=newnsch3c;
by _imputation_ asthmsev_1718;
run;

However, I am a bit puzzled as to the nature of asthmsev_1718.  You specify a numeric reference category, which makes me think that this is a severity score, and that the variable is in fact ordinal rather than nominal.  That would make a big difference in how to handle this error, and in whether the glogit is the proper link to be using.  So if this is an asthma severity score for 17 and 18 year olds, you may wish to consider this:

 

model asthmsev_1718 (ref='1') = age3_1718 sex race FPL smokeinside composite family smokeinside*composite/link=logit;

or this:

model asthmsev_1718 (order=INTERNAL) = age3_1718 sex race FPL smokeinside composite family smokeinside*composite/link=logit;

 

However, more discussion on this variable may make this either unnecessary or in need of additional work.

 

SteveDenham

 

ChuksManuel
Pyrite | Level 9

Hello,

Thank you. I sorted the dataset and that problem resolved.

My variable is asthma severity categorized as none, mild and moderate/severe. I guess that makes it ordinal in nature.

I tried the clogit link, as suggested online for ordinal logistic regression. However, I'm interested in the odds ratios for mild vs. none and moderate/severe vs. none, and the ordinal logistic model (with both clogit and logit) did not give me that. Instead, it gave me a single odds ratio (see the output below).

However, when I used the multinomial (glogit) link function, it gave me separate ORs for my predictors for each outcome comparison (mild vs. none and moderate/severe vs. none), which is what I'm interested in.

ChuksManuel_3-1594767753073.png

This output is the result of the ordinal logistic model. Are the ORs telling me that mold is protective against asthma severity?

The multinomial logistic output below, for example, is telling me that mold had a stronger association with moderate/severe asthma than with mild asthma.

ChuksManuel_2-1594767267322.png

 

 

SteveDenham
Jade | Level 19

You make a good case for choosing the generalized logit, as you can get the result directly from the solution vector.  You could obtain the ORs you mention with a cumulative logit through the use of an LSMEANS statement with a DIFF option, or an LSMESTIMATE statement.  So that's behind us, I think.

 

Now on to the second panel.  The two ORs for Mold really look the same to me (1.403 vs 1.460), and the 95% interval for Mold in those with mild asthma is completely contained within the 95% interval for those with moderate asthma.  Thus I would not say that there is a stronger association.

 

What is intriguing to me is that the OR in the cumulative logit case is the reciprocal of the weighted average of the ORs in the generalized logit (at least to rounding errors).  I don't know if this is coincidental or if the choice of link influences the reference such that this should be expected.  I tried some algebra and had 4 equations with 8 unknowns, so I gave up.
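
One possible reading of the reciprocal pattern, offered as a sketch rather than a derivation of the exact weights: by default SAS fits the cumulative logit to the probability of the lower ordered response values, whereas the generalized logit compares each category with the reference. This assumes severity is coded so that "none" is the lowest value and is also the glogit reference, which is an assumption because the coding is not shown in the thread. The symbols below are notation introduced for illustration.

```latex
\begin{aligned}
\text{cumulative logit:}\quad & \log\frac{P(Y \le j)}{P(Y > j)} = \alpha_j + x^{\top}\beta, \\
\text{generalized logit:}\quad & \log\frac{P(Y = j)}{P(Y = \text{none})} = \theta_j + x^{\top}\beta_j .
\end{aligned}
```

Under that coding, \exp(\beta) < 1 in the cumulative model means lower odds of falling in a less severe category, which points the same way as \exp(\beta_j) > 1 in the generalized logit model. That would account for the direction of the reciprocal relationship, though not for the specific weighted-average match.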

 

SteveDenham
