Best-subset instead of stepwise question.
Hello, I have classes of individuals grouped together from cluster analysis. I want to use discriminant analysis to determine group membership of new individuals based on a set of predictors. Normally, I use PROC STEPDISC to find a subset of predictors that go into the discriminant analysis, something like:
proc stepdisc data=training sle=0.05 singular=0.1;
class group;
var VAR1--VAR25;
run;
However, recent literature indicates that stepwise selection is not as good as evaluating all possible subsets of predictors. Is there a procedure, or some other approach, that can do this? I have looked at the PHREG, REG, and LOGISTIC procedures, but they all seem to be based on numerical data rather than classes. Have I missed something? Or should I just convert the group data from text to numerical?
Thanks in advance.
peat
Best-subset variable selection isn't available in PROC STEPDISC. If you have only two groups, or if you want to explore group differences two groups at a time, you can perform best-subset selection in PROC LOGISTIC:
title "Discriminating groups A and B";
proc logistic data=training(where=(group in ("A", "B")));
class group;
model group(event="B") = VAR1 -- VAR25 / selection=score best=3 stop=5;
run;
PG
Hi PG, and thanks for the response. I actually have 4 groups (sometimes more). It looks like I can just use:
proc logistic data=training;
class group;
model group= VAR1 -- VAR25 / selection=score best=3 stop=5;
run;
This is very helpful. However, is there a way to compare the output models for overfitting? For example, are four predictors really better than three?
Cheers,
peat
Peat,
Probably the best way to address overfitting is with bootstrapping. There is a substantial literature on it.
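As a rough illustration of the idea (a sketch under stated assumptions, not a definitive recipe), one could refit each candidate model on bootstrap resamples and score the original data with every refit, then compare misclassification rates. The data set `training` and the response `group` come from this thread; the macro name `boot_err`, the predictor lists, the seed, and the number of resamples are placeholders chosen for illustration only. `F_group` and `I_group` are the observed and predicted response levels that the SCORE statement writes to its OUT= data set.

```sas
/* Sketch only: predictor lists, seed, and reps are placeholders, not results */

/* Draw 200 bootstrap resamples (same size as training, with replacement) */
proc surveyselect data=training out=boot seed=20240
     method=urs samprate=1 outhits reps=200 noprint;
run;

%macro boot_err(vars=, out=);
  ods select none;                 /* suppress the per-replicate listings */
  proc logistic data=boot;
    by Replicate;                  /* one fit per bootstrap resample */
    model group = &vars;
    /* training has no Replicate variable, so it is scored once per fit */
    score data=training out=_scored;
  run;
  ods select all;

  /* Flag observations whose predicted group differs from the observed one */
  data _scored;
    set _scored;
    miss = (F_group ne I_group);
  run;

  /* Misclassification rate on the original data, per resample */
  proc means data=_scored noprint nway;
    class Replicate;
    var miss;
    output out=&out mean=err_rate;
  run;
%mend boot_err;

%boot_err(vars=VAR3 VAR7 VAR12,       out=err3);   /* hypothetical 3-predictor model */
%boot_err(vars=VAR3 VAR7 VAR12 VAR20, out=err4);   /* hypothetical 4-predictor model */

/* Compare the average error rate and its spread across resamples */
proc means data=err3 mean std; var err_rate; run;
proc means data=err4 mean std; var err_rate; run;
```

If the four-predictor model does not give a clearly lower average error rate than the three-predictor model, the extra predictor is probably not earning its place. Note that this only evaluates models you have already chosen; repeating the subset selection itself inside each resample would be a stricter check.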
Doc Muhlbaier
Duke
Thanks Duke, I will look into it.
peat