eajanio
Calcite | Level 5

Hello,

I ran a logistic regression model assessing the interaction between distance (a continuous variable) and SES (a categorical variable with two levels). I then wanted to calculate the predicted probabilities of the outcome for each level of the interaction. However, I understand that this cannot be done with a continuous variable in the interaction term. Thus, I used the LSMEANS statement to evaluate the predicted probability for each SES level at specific values of distance.

My question is about interpreting the output. The interaction term was significant in my original logistic regression model: the interaction between distance and the second level of SES was significantly associated with the outcome, with the first level of SES as the reference. However, once I calculated the predicted probabilities, these values were not significantly different from one another. Does this mean that overall, as distance changes, there is an association with the outcome among those in the second SES group, but that at these specific values of distance, there is no difference between the two groups?

Thank you.

 

proc logistic data="H:\desktop\DataAnalysis" descending;
   class SES (ref='1') / param=ref;
   model stage = SES | distance;
   where sample = 1;
run;

 

proc logistic data="H:\desktop\DataAnalysis" descending;
   class SES (ref='1') / param=glm;
   model stage = SES | distance;
   lsmeans SES / at distance=2.67 ilink or cl diff;
   lsmeans SES / at distance=7.60 ilink or cl diff;
   lsmeans SES / at distance=24.51 ilink or cl diff;
   where sample = 1;
run;

4 REPLIES 4
PaigeMiller
Diamond | Level 26

I then wanted to calculate the predicted probabilities of the outcome for each level of the interaction. However, I understand that this cannot be done with a continuous variable in the interaction term.

 

This is not correct. Predicted values can be created from any model. If you are using PROC LOGISTIC, you can do this using the OUTPUT statement, and several other ways. See:

https://blogs.sas.com/content/iml/2014/02/17/the-missing-value-trick-for-scoring-a-regression-model....

https://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html
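As a rough illustration of the scoring approach described above, the sketch below builds a small grid of SES and distance values and scores it with the SCORE statement of PROC LOGISTIC. The data set names work.analysis, score_grid, and predicted are placeholders, not names from the original post, and the sketch assumes SES is stored as a numeric variable with values 1 and 2.

```sas
/* Hypothetical scoring grid: every SES level at each distance of interest. */
/* Assumes SES is numeric with values 1 and 2; adjust if it is character.   */
data score_grid;
   do SES = 1, 2;
      do distance = 2.67, 7.60, 24.51;
         output;
      end;
   end;
run;

/* work.analysis is a placeholder for the analysis data set.                */
/* The WHERE= data set option restricts the fit to sample = 1.              */
proc logistic data=work.analysis(where=(sample=1)) descending;
   class SES (ref='1') / param=ref;
   model stage = SES | distance;
   score data=score_grid out=predicted;   /* adds predicted probabilities  */
run;

proc print data=predicted;
run;
```

The predicted probabilities appear in the output data set in columns whose names begin with P_. Because they come from the same fitted model, they should agree with the back-transformed (ILINK) LSMEANS computed at the same distance values.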

 

You will have to specify the values of the continuous variable at which to predict, as you tried to do using LSMEANS. You could also use the EFFECTPLOT statement to draw a picture of the interaction. To answer your question about the LSMEANS: in this case, the LSMEANS would lie on the lines shown in the interaction plot.
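A minimal sketch of the EFFECTPLOT suggestion, assuming the same model as in the question. The data set name work.analysis is a placeholder, and the choice of a SLICEFIT plot with confidence limits is one reasonable option rather than the only one.

```sas
ods graphics on;

/* work.analysis is a placeholder for the analysis data set. */
proc logistic data=work.analysis(where=(sample=1)) descending;
   class SES (ref='1') / param=glm;
   model stage = SES | distance;
   /* One fitted curve per SES level across distance, probability scale */
   effectplot slicefit(x=distance sliceby=SES) / clm;
   /* Same plot on the log-odds (link) scale, where the fits are lines  */
   effectplot slicefit(x=distance sliceby=SES) / clm link;
run;
```

On the link scale, a significant interaction shows up as two lines with different slopes. On the probability scale, the gap between the curves at a given distance is what the back-transformed LSMEANS describe.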

--
Paige Miller
eajanio
Calcite | Level 5

Thank you for your reply. Would the predicted probabilities produced here be different than those I've obtained with my code?

PaigeMiller
Diamond | Level 26

Please provide more information. Please explain what you did. Please provide code. Please provide output.

 

The predicted values that SAS produces, I assume, are correct. If you are obtaining different predicted values, I would assume you have made a mistake somewhere.

--
Paige Miller
StatDave
SAS Super FREQ

The significant interaction tells you that the difference between the SES groups changes depending on the distance, or equivalently that the effect of distance depends on SES. Think of the slopes of the two distance curves for the SES levels at each of various values of distance. But note that you are talking about the SES difference or the distance effect in terms of the log odds, not the event probability. This is discussed in more detail in this note. See, in particular, the Poisson model example on rates and the use of the Margins macro at the end, which compares the event probabilities at selected values of the continuous predictor (as well as comparing slopes). A similar approach can be used for your logistic model. Note also the use of the EFFECTPLOT statement to plot the predicted log odds or predicted probabilities.
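The distinction can be written out for the model in the question. This is a sketch in assumed notation that does not appear in the thread: s is an indicator for the second SES level (reference level 1), d is distance, and p(s, d) is the event probability.

```latex
\begin{align*}
\operatorname{logit} p(s, d) &= \beta_0 + \beta_1 s + \beta_2 d + \beta_3 s d,
  \qquad s = \mathbf{1}[\text{SES} = 2] \\
\operatorname{logit} p(1, d) - \operatorname{logit} p(0, d) &= \beta_1 + \beta_3 d \\
p(s, d) &= \frac{1}{1 + \exp\{-(\beta_0 + \beta_1 s + \beta_2 d + \beta_3 s d)\}}
\end{align*}
```

The interaction test is a test of whether \beta_3 = 0, that is, whether the two SES groups have the same slope in distance on the log-odds scale. The DIFF comparison from LSMEANS at a fixed distance d_0 is instead a test of whether \beta_1 + \beta_3 d_0 = 0, the gap between the two lines at that single point. The two tests can disagree: the slopes may differ significantly while the lines are close together, or cross, near the chosen distances. The difference in probabilities, p(1, d) - p(0, d), is a nonlinear function of d, so it changes with distance even when \beta_3 = 0.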
