Melk
Lapis Lazuli | Level 10

I am reading mixed opinions about whether stepwise selection is appropriate for an ordinal logistic regression model built for prediction. Does anyone have any input on this they would be willing to share?

sld
Rhodochrosite | Level 12

The statistical literature is not mixed regarding the appropriateness of stepwise methods: the consensus, after literally decades of study, is "don't use them." You can review the literature for the reasons; the primary disadvantage is inflated Type I error, but there are other disadvantages as well.

My sense is that practitioners (in other words, non-statisticians) promote stepwise because it is SO EASY, you hardly have to give any thought to it at all. But easy is not the same thing as good or appropriate.
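
The inflated Type I error mentioned above can be seen in a small simulation. The following is only a sketch under simplifying assumptions, not stepwise selection as any SAS procedure implements it: it mimics just the first forward-selection step, uses a linear (not ordinal logistic) model, and all function names, sample sizes, and the critical value are illustrative choices. The response and every candidate predictor are pure noise, yet the "best" candidate is tested at a nominal 5% level.

```python
import math
import random


def best_abs_t(n_obs, n_pred, rng):
    """Largest |t| among n_pred pure-noise predictors, each tested alone
    against a pure-noise response (a stand-in for one forward-selection step)."""
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    y_mean = sum(y) / n_obs
    yc = [v - y_mean for v in y]
    syy = sum(v * v for v in yc)
    best = 0.0
    for _ in range(n_pred):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        x_mean = sum(x) / n_obs
        xc = [v - x_mean for v in x]
        sxx = sum(v * v for v in xc)
        sxy = sum(a * b for a, b in zip(xc, yc))
        r = sxy / math.sqrt(sxx * syy)
        # t statistic for the slope in a single-predictor regression
        t = r * math.sqrt((n_obs - 2) / (1 - r * r))
        best = max(best, abs(t))
    return best


def false_entry_rate(n_obs=100, n_pred=20, n_sims=300, crit=1.984, seed=1):
    """Share of simulations in which the best noise predictor looks
    'significant'. crit is roughly the two-sided 5% t cutoff for 98 df."""
    rng = random.Random(seed)
    hits = sum(best_abs_t(n_obs, n_pred, rng) > crit for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    print("1 candidate  :", false_entry_rate(n_pred=1))
    print("20 candidates:", false_entry_rate(n_pred=20))
```

With a single candidate the rate should sit near the nominal 5%. With 20 independent candidates, the chance that at least one clears the cutoff is 1 - 0.95^20, or about 0.64, so a noise variable "enters" the model most of the time even though nothing is related to the response.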

PaigeMiller
Diamond | Level 26

@sld wrote:

My sense is that practitioners (in other words, non-statisticians) promote stepwise because it is SO EASY, you hardly have to give any thought to it at all. But easy is not the same thing as good or appropriate.


In fact, if you take a class from SAS Institute about logistic regression, you hear exactly this. The instructor I had made a point of mentioning that stepwise has all these drawbacks, but I was left with the impression that the instructor was advising the class to go ahead and use it anyway.

I believe a better solution is Logistic Partial Least Squares regression. Partial Least Squares is usually much more effective in the case of collinearity, and is available in SAS for continuous responses, but there is no logistic version available in SAS. You could probably program your own version of Logistic PLS using this paper: https://cedric.cnam.fr/fichiers/RC906.pdf

--
Paige Miller
StatDave
SAS Super FREQ

Better, more modern selection methods include LASSO, least angle regression (LAR), and elastic net. These methods are available in various SAS procedures as mentioned in this list of frequently asked-for statistics. For logistic models, the LASSO method is available in PROC HPGENSELECT. For more information see this note and the link there to an article by Gunes.
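
For orientation, the penalized criteria behind two of the methods named above can be written as follows. This is a generic textbook form, not necessarily the exact parameterization any SAS procedure uses; the symbols are assumptions introduced here: \ell(\beta) is the model's log likelihood, p is the number of candidate effects, and \lambda, \lambda_1, \lambda_2 \ge 0 are tuning parameters.

```latex
% LASSO: L1 penalty on the coefficients
\[
\hat{\beta}_{\text{lasso}}
  = \arg\max_{\beta} \left\{ \ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j| \right\}
\]

% Elastic net: L1 penalty plus squared-L2 penalty
\[
\hat{\beta}_{\text{enet}}
  = \arg\max_{\beta} \left\{ \ell(\beta) - \lambda_1 \sum_{j=1}^{p} |\beta_j|
      - \lambda_2 \sum_{j=1}^{p} \beta_j^{2} \right\}
\]
```

Setting \lambda_2 = 0 in the elastic net recovers the LASSO. LAR is not defined by a penalty; it is an algorithm whose solution path is closely related to the LASSO path.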

Melk
Lapis Lazuli | Level 10

Is LASSO appropriate for only really large datasets?

StatDave
SAS Super FREQ

No. As shown in the note I referred to, LASSO just adds a penalty to the log likelihood to be maximized, so it is not limited to large datasets.
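
To make the "penalty on the log likelihood" idea concrete on a deliberately small dataset, here is a minimal sketch. It is not what PROC HPGENSELECT does internally: it fits a binary (not ordinal) logistic model with a simple proximal-gradient loop, and every name, the simulated data, the step size, and the penalty values are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within t of zero become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_logistic(X, y, lam, step=0.5, n_iter=500):
    """Minimize (mean negative log likelihood) + lam * sum(|beta_j|).
    The intercept b0 is left unpenalized."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # residual = fitted probability minus observed 0/1 response
        resid = [sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        g0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                for j in range(p)]
        b0 -= step * g0
        # gradient step on the likelihood part, then the L1 shrinkage step
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return b0, beta


def make_data(n=80, seed=7):
    """Simulate 80 rows and 6 predictors; only the first two matter."""
    rng = random.Random(seed)
    true_beta = [2.0, -1.5, 0.0, 0.0, 0.0, 0.0]
    X = [[rng.gauss(0, 1) for _ in true_beta] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(sum(b * x for b, x in zip(true_beta, row)))
         else 0 for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    for lam in (0.0, 0.05, 10.0):
        _, beta = lasso_logistic(X, y, lam)
        print(lam, [round(b, 3) for b in beta])
```

With lam=0 this is ordinary maximum likelihood; as lam grows the coefficients shrink and some become exactly zero, which is how the penalty performs selection. In practice the predictors would usually be standardized first and lam chosen by validation or cross-validation rather than fixed by hand.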
