Melk
Lapis Lazuli | Level 10

I am reading mixed things about whether stepwise selection is appropriate for a predictive ordinal logistic regression model. Does anyone have any input on this that they would be willing to share?

1 ACCEPTED SOLUTION

StatDave
SAS Super FREQ

Better, more modern selection methods include LASSO, least angle regression (LAR), and elastic net. These methods are available in various SAS procedures as mentioned in this list of frequently asked-for statistics. For logistic models, the LASSO method is available in PROC HPGENSELECT. For more information see this note and the link there to an article by Gunes.

sld
Rhodochrosite | Level 12

The statistical literature is not mixed regarding the appropriateness of stepwise methods: the consensus, after literally decades of study, is "Don't use them." You can review the literature for the reasons; the primary disadvantage is inflated Type I error, but there are other disadvantages as well.
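
To make the inflated Type I error point concrete, here is a minimal simulation sketch. It is not from the original post and it simplifies heavily: the response is continuous rather than ordinal, the predictors are pure noise, and only the first step of a forward selection is imitated (pick the candidate with the smallest p-value and let it enter if p < alpha). The function names, sample sizes, and the Fisher z approximation for the p-value are all assumptions chosen for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using the Fisher z normal approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def false_entry_rate(n_obs=100, n_noise=20, n_sims=300, alpha=0.05, seed=1):
    """Share of simulated datasets in which the best of n_noise unrelated
    predictors would pass an entry test at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Response and predictors are independent, so every "entry" is a false positive.
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        best_p = min(
            p_value(pearson_r([rng.gauss(0, 1) for _ in range(n_obs)], y), n_obs)
            for _ in range(n_noise)
        )
        if best_p < alpha:
            hits += 1
    return hits / n_sims


rate = false_entry_rate()
print(f"share of pure-noise datasets where a predictor 'enters': {rate:.2f}")
```

With 20 independent noise candidates and alpha = 0.05, the chance that at least one passes is roughly 1 - 0.95^20, about 0.64, far above the nominal 5%. The simulated share should land in that neighborhood, which is the sense in which picking the "best" variable and then testing it inflates Type I error.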

My sense is that practitioners (in other words, non-statisticians) promote stepwise because it is SO EASY, you hardly have to give any thought to it at all. But easy is not the same thing as good or appropriate.

PaigeMiller
Diamond | Level 26

@sld wrote:

My sense is that practitioners (in other words, non-statisticians) promote stepwise because it is SO EASY, you hardly have to give any thought to it at all. But easy is not the same thing as good or appropriate.


In fact, this is what you hear if you take a SAS Institute class on logistic regression. The instructor I had made a point of mentioning that stepwise has all these drawbacks, but I was left with the impression that the instructor was advising the class to go ahead and use it anyway.

I believe a better solution is Logistic Partial Least Squares regression. Partial Least Squares is usually much more effective in the case of collinearity, and is available in SAS for continuous responses, but there is no logistic version available in SAS. You could probably program your own version of Logistic PLS using this paper: https://cedric.cnam.fr/fichiers/RC906.pdf

--
Paige Miller
Melk
Lapis Lazuli | Level 10

Is LASSO appropriate for only really large datasets?

StatDave
SAS Super FREQ

No. As shown in the note I referred to, LASSO just adds a penalty to the log likelihood being maximized, so nothing about it requires a large dataset.
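
As a rough illustration of "adds a penalty to the log likelihood," the sketch below fits an L1-penalized logistic model on a deliberately small simulated dataset (60 rows, 6 predictors). It is an assumption-laden toy, not what PROC HPGENSELECT does internally: the response is binary rather than ordinal, the objective is written as minimizing (negative log likelihood)/n + lam * sum(|beta_j|) with an unpenalized intercept, and the solver is a plain proximal-gradient loop. All names, the step size, and the lam values are hypothetical choices for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t, clipping at zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=500):
    """Minimize mean negative log likelihood + lam * sum(|beta_j|)
    by proximal gradient descent. The intercept b0 is not penalized."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals (fitted probability minus outcome) at the current coefficients.
        resid = [
            sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) - yi
            for row, yi in zip(X, y)
        ]
        b0 -= step * sum(resid) / n
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(resid, X)) / n
            # Gradient step on the likelihood part, then shrink for the L1 part.
            beta[j] = soft_threshold(beta[j] - step * grad_j, step * lam)
    return b0, beta


# Small simulated dataset: only the first two predictors carry signal.
rng = random.Random(0)
n_rows, n_cols = 60, 6
true_beta = [2.0, -1.5, 0.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]
y = [
    1 if rng.random() < sigmoid(sum(b * x for b, x in zip(true_beta, row))) else 0
    for row in X
]

b0, beta = fit_lasso_logistic(X, y, lam=0.05)
_, beta_heavy = fit_lasso_logistic(X, y, lam=10.0)
print("lam=0.05:", [round(b, 3) for b in beta])
print("lam=10.0:", [round(b, 3) for b in beta_heavy])
```

The penalty is what performs the selection: a large lam shrinks every coefficient to exactly zero, while a smaller lam lets the stronger predictors back in. Nothing in the method depends on the sample size being large; in practice lam would be chosen by a criterion such as cross-validation or an information criterion rather than fixed by hand as it is here.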
