Hello Experts:
I am having difficulty explaining which independent variable is the most predictive when running PROC LOGISTIC.
In my PROC LOGISTIC step, I modeled the dependent variable (event = '1'); see the code below:
proc logistic data=data_test;
model p_flag (event = '1') = var1 var2 var3 / selection=stepwise;
run;
Then I get these results for the stepwise selection:
Summary of Stepwise Selection
Step | Effect Entered | Effect Removed | DF | Number In | Score Chi-Square | Wald Chi-Square | Pr > ChiSq | Variable Label
1 | var3 | | 1 | 1 | 2006.955 | | <.0001 |
2 | var2 | | 1 | 2 | 150.3232 | | <.0001 |
3 | var1 | | 1 | 3 | 43.6837 | | <.0001 |
and the maximum likelihood estimates:
Analysis of Maximum Likelihood Estimates
Parameter | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq
Intercept | 1 | -4.1428 | 0.1142 | 1316.869 | <.0001
var1 | 1 | 1.083 | 0.1556 | 48.4503 | <.0001
var2 | 1 | -0.1256 | 0.00706 | 316.7223 | <.0001
var3 | 1 | 0.2754 | 0.00463 | 3541.0916 | <.0001
And the odds ratios:
Odds Ratio Estimates
Effect | Point Estimate | 95% Wald Lower Confidence Limit | 95% Wald Upper Confidence Limit
var1 | 2.954 | 2.177 | 4.007
var2 | 0.882 | 0.87 | 0.894
var3 | 1.317 | 1.305 | 1.329
So my question is:
Which one of the 3 variables is the most predictable for p_flag = 1? In the maximum likelihood estimates table, which output column do I need to look at: Estimate or Wald Chi-Square?
Thanks a lot!
LHK
I don't think there is an answer to your question. I don't really even understand what you want when you say "most predictable". It doesn't make sense to me to just use one predictor given these results.
I would pick the Wald Chi-Square, since it is a test statistic.
And var3 has the biggest Chi-Square, so its Pr > ChiSq should be the smallest. Therefore, var3 is the most significant variable.
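For anyone comparing the two columns, the arithmetic that links them can be checked by hand. The sketch below is only an illustration: it retypes the Estimate and Standard Error values posted above into a small data set (the names check, estimate, std_err, wald_chisq and odds_ratio are made up for this example) and recomputes the Wald chi-square as (Estimate / Standard Error) squared and the odds ratio as exp(Estimate).

```sas
/* Illustrative check only: values are copied from the posted output. */
data check;
  input parameter $ estimate std_err;
  wald_chisq = (estimate / std_err)**2;  /* Wald chi-square, 1 DF */
  odds_ratio = exp(estimate);            /* odds ratio per one-unit change */
  datalines;
var1 1.083 0.1556
var2 -0.1256 0.00706
var3 0.2754 0.00463
;
run;

proc print data=check noobs;
run;
```

The recomputed values should land close to the posted ones, with small differences because the posted estimates are rounded. The Estimate and the odds ratio are per one-unit change in each variable, so they depend on that variable's units. The Wald chi-square does not change when a variable is rescaled, but it measures the strength of evidence against a zero coefficient rather than the size of the effect.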
@Ksharp wrote:
I would pick the Wald Chi-Square, since it is a test statistic.
And var3 has the biggest Chi-Square, so its Pr > ChiSq should be the smallest. Therefore, var3 is the most significant variable.
True, but the question was not "most significant variable", it was "most predictable", which seems to me to be a meaningless question, and different from "most significant".
Anyway, the idea of selecting one of these three variables as the "best" seems like a very poor idea.
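If a comparison on a common scale is still wanted, one option that may help is the STB option on the MODEL statement, which adds standardized estimates to the Analysis of Maximum Likelihood Estimates table. This is a sketch that reuses the data set and variable names from the original post and drops the stepwise selection for simplicity (an assumption, made because all three variables were retained anyway). It does not settle which variable is "best"; it only puts the coefficients in comparable units.

```sas
/* Sketch: same model as in the original post, with standardized estimates requested. */
proc logistic data=data_test;
  model p_flag (event = '1') = var1 var2 var3 / stb;  /* STB adds a Standardized Estimate column */
run;
```

The standardized estimates can then be read alongside the Wald chi-squares and odds ratios, keeping all three variables in the model.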