LHK
Calcite | Level 5

Hello Experts:

I am having difficulty explaining which independent variable is more predictive when running PROC LOGISTIC.

In my PROC LOGISTIC step, I modeled the dependent variable p_flag (event = '1'); see the code below:

proc logistic data=data_test;

model p_flag (event = '1') = var1 var2 var3 / selection=stepwise;

run;

Then I get these results for the stepwise selection:

Summary of Stepwise Selection
Step  Effect Entered  Effect Removed  DF  Number In  Score Chi-Square  Wald Chi-Square  Pr > ChiSq  Variable Label
1     var3                            1   1          2006.955                           <.0001
2     var2                            1   2          150.3232                           <.0001
3     var1                            1   3          43.6837                            <.0001

and the maximum likelihood estimates:

Analysis of Maximum Likelihood Estimates
Parameter  DF  Estimate  Standard Error  Wald Chi-Square  Pr > ChiSq
Intercept  1   -4.1428   0.1142          1316.869         <.0001
var1       1   1.083     0.1556          48.4503          <.0001
var2       1   -0.1256   0.00706         316.7223         <.0001
var3       1   0.2754    0.00463         3541.0916        <.0001

And the odds ratios:

Odds Ratio Estimates
Effect  Point Estimate  95% Wald Confidence Limits
var1    2.954           2.177    4.007
var2    0.882           0.87     0.894
var3    1.317           1.305    1.329

 

So my question is:

Which one of the 3 variables is the most predictable for p_flag = 1? In the maximum likelihood estimates table, which output field do I need to look at: Estimate or Wald Chi-Square?

 

Thanks a lot!

LHK

3 REPLIES
PaigeMiller
Diamond | Level 26

I don't think there is an answer to your question. I don't really even understand what you want when you say "most predictable". It doesn't make sense to me to just use one predictor given these results.

--
Paige Miller
Ksharp
Super User

I would pick the Wald Chi-Square, since it is a test statistic.

And var3 has the biggest Chi-Square, so its Pr > ChiSq should be the smallest. Therefore, var3 is the most significant variable.
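
As a rough check of that reasoning, the Wald Chi-Square for a 1-DF parameter is (Estimate / Standard Error) squared, and the odds ratio point estimate is exp(Estimate). The sketch below recomputes both from the values posted in the question. The data set name wald_check and its variable names are made up for illustration, and because the inputs are the rounded values from the post, the results should come out close to, but not exactly equal to, the PROC LOGISTIC output.

```sas
/* Sketch only: recompute Wald chi-square, p-value, and odds ratio   */
/* from the rounded estimates and standard errors posted above.      */
/* wald_check, std_err, wald_chisq, etc. are illustrative names.     */
data wald_check;
    input parameter $ estimate std_err;
    wald_chisq = (estimate / std_err)**2;      /* Wald chi-square, 1 DF  */
    p_value    = 1 - probchi(wald_chisq, 1);   /* upper-tail probability */
    odds_ratio = exp(estimate);                /* odds ratio per 1 unit  */
    datalines;
var1  1.0830 0.15560
var2 -0.1256 0.00706
var3  0.2754 0.00463
;
run;

proc print data=wald_check noobs;
run;
```

Note that the Wald Chi-Square grows as the standard error shrinks, so it measures how precisely a coefficient is estimated relative to its size, while the odds ratio depends on the units of each variable.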

PaigeMiller
Diamond | Level 26

@Ksharp wrote:

I would pick the Wald Chi-Square, since it is a test statistic.

And var3 has the biggest Chi-Square, so its Pr > ChiSq should be the smallest. Therefore, var3 is the most significant variable.


True, but the question was not "most significant variable", it was "most predictable", which seems to me to be a meaningless question, and different from "most significant".

 

Anyway, selecting one of these three variables as the "best" seems like a very poor idea.

--
Paige Miller
