SAS Procedures

Help using Base SAS procedures
BookmarkSubscribeRSS Feed
peatjohnston
Calcite | Level 5

Best-subset instead of stepwise question.

Hello, I have classes of individuals grouped together from cluster analysis. I want to use discriminant analysis to determine group membership of new individuals based on a set of predictors. Normally, I use PROC STEPDISC to find a subset of predictors that go into the discriminant analysis, something like:

proc stepdisc     data=training sle=0.05 singular=0.1;

     class group;

     var VAR1--VAR25

run;

However, recent literature indicates stepwise selection is not as good as evaluating all possible subsets of predictors. Is there a procedure, or otherwise, that can do this? I have looked at PHREG REG and LOGISTIC procedures, but they all seem to be based on numerical data rather than classes. Have I missed something? or should I just convert the group  data from text to numerical?

Thanks in advance.

peat

4 REPLIES 4
PGStats
Opal | Level 21

Best variable subset selection isn't available in PROC STEPDISC. If you have only two groups or if you want to explore group differences two groups at a time, you can perform best variable subset selection in PROC LOGISTIC

title "Discriminating groups A and B";

proc logistic data=training(where=(group in ("A", "B")));

class group;

model group(event="B") = VAR1 -- VAR25 / selection=score best=3 stop=5;

run;

PG

PG
peatjohnston
Calcite | Level 5

Hi PG, and thanks for the response. I actually have 4 groups (sometimes more). It looks like I can just use:

proc logistic data=training;

class group;

model group= VAR1 -- VAR25 / selection=score best=3 stop=5;

run;

This is very helpful. However, is there a way to compare the output models for overfitting? e.g. are four preditors really better than three.

Cheers,

peat

Doc_Duke
Rhodochrosite | Level 12

Peat,

Probably the best way to address overfitting is with Bootstrapping.  there is a substantial literature on it.

Doc Muhlbaier

Duke

peatjohnston
Calcite | Level 5

Thanks Duke, I will look into it.

peat

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1779 views
  • 3 likes
  • 3 in conversation