zzzyyy
Fluorite | Level 6

My variable pool contains both categorical and continuous variables. I plan to use the SELECTION=STEPWISE option to perform stepwise variable selection.

 

For categorical variables, is it enough to list them in the CLASS statement? Will the selection process treat the different levels of a categorical variable as one variable? Or do I need to specify categorical variables with something like groupnames='Height' 'Age', as in PROC REG variable selection?

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

Unless you specify the SPLIT option on the CLASS statement in PROC GLMSELECT or PROC HPGENSELECT, the whole variable will either be included or excluded. PROC LOGISTIC does not support the SPLIT option, so again the whole variable is selected or not.


6 REPLIES
Reeza
Super User

"Will the selection process treat the different levels of a categorical variable as one variable?"

 

Yes, it will. If you want to treat each level individually, you need to create your own dummy variables, and in general that is not statistically recommended anyway.

 



zzzyyy
Fluorite | Level 6

I don't want to treat them individually; I want the different levels to be treated as a single group. So is the CLASS statement in PROC LOGISTIC enough to restrict the variable selection method so that a group of variables enters or leaves the model together?

Rick_SAS
SAS Super FREQ

Can you clarify? Your title says 'logistic' but you mention "PROC REG."

 

For linear regression models, look at PROC GLMSELECT.

For logistic regression models, look at PROC HPGENSELECT.
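
As a rough illustration of the linear-regression case, here is a minimal sketch of stepwise selection in PROC GLMSELECT with categorical predictors listed on the CLASS statement. The data set (SASHELP.HEART) and the choice of response and predictors are assumptions made only for this example; they are not from the original question.

```sas
/* Sketch only: SASHELP.HEART and these variables are illustrative assumptions. */
proc glmselect data=sashelp.heart;
   /* Categorical predictors go on the CLASS statement. */
   class Sex Smoking_Status BP_Status;
   /* Each CLASS variable is one effect, so all of its levels */
   /* enter or leave the model together during selection.     */
   model Cholesterol = Sex Smoking_Status BP_Status AgeAtStart Weight
         / selection=stepwise;
run;
```

In the selection summary, an effect such as Smoking_Status should appear as a single step rather than one step per level.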

zzzyyy
Fluorite | Level 6

I used to use PROC REG with the GROUPNAMES= option to restrict the effect selection method so that a group of variables enters or leaves the model together. Now I'm going to use PROC LOGISTIC, and I'm wondering whether the CLASS statement will achieve the same result as GROUPNAMES= during the variable selection process.

Rick_SAS
SAS Super FREQ

Unless you specify the SPLIT option on the CLASS statement in PROC GLMSELECT or PROC HPGENSELECT, the whole variable will either be included or excluded. PROC LOGISTIC does not support the SPLIT option, so again the whole variable is selected or not.
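
To make the contrast concrete, here is a hedged sketch of both cases. The data set (SASHELP.HEART), the variables, and the SLENTRY=/SLSTAY= thresholds are assumptions chosen for illustration, not part of the original question.

```sas
/* Sketch only: data set, variables, and thresholds are illustrative assumptions. */

/* PROC LOGISTIC: each CLASS variable is a single effect, so the whole */
/* variable (all of its levels) is either selected or not.             */
proc logistic data=sashelp.heart;
   class Sex Smoking_Status BP_Status / param=ref;
   model Status(event='Dead') = Sex Smoking_Status BP_Status AgeAtStart Cholesterol
         / selection=stepwise slentry=0.05 slstay=0.05;
run;

/* PROC GLMSELECT (linear model): the SPLIT option on one CLASS variable */
/* lets the design columns for its levels enter or leave independently.  */
proc glmselect data=sashelp.heart;
   class Sex Smoking_Status(split) BP_Status;
   model Cholesterol = Sex Smoking_Status BP_Status AgeAtStart Weight
         / selection=stepwise;
run;
```

In the first step no extra grouping option is needed: listing the variable on the CLASS statement already gives the "enters or leaves together" behavior that GROUPNAMES= was used for in PROC REG.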

zzzyyy
Fluorite | Level 6

Got it, thanks!
