tpakhomova
Fluorite | Level 6

I am using PROC GLMSELECT for a multiple linear regression model whose explanatory variables include categorical variables with more than 2 levels. I am examining the relationship between stress scores and sexual health variables. I am pretty new to SAS, so I need some help determining whether I am coding this correctly and whether my interpretation is correct.

Here is my model:


proc GLMSELECT data=baseline;
class gender;
model PSS_score = sexual_orient ever_preg cons_sex_age partners_6M CES_D_dep
/ selection=stepwise select=SL showpvalues SLE=0.05 ;
title "Stepwise Regression SRH for OVERALL, 0.05";
run;

 

1) Does this model look like it's coded alright?

2) When I run this, how exactly do I interpret the results? For example, my variable "cons_sex_age" (age at first sex) has three levels (never had sex, 15 and under, 16 and over), coded as 0, 1, 2. How do I interpret the relationship between stress score and "cons_sex_age"? Would my reference category be the level coded as 0, i.e. "never had sex", so that each parameter estimate compares another level against it? Or do I need to create dummy variables for GLMSELECT?

 

I have found the SAS guidance notes on this quite confusing. 

 

Thanks, 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

1. No, I do not think your code does what you intend, though it is syntactically correct. For starters, I think you want the REF= and PARAM= options on the CLASS statement. Otherwise the default parameterization is GLM, and you should check the design matrix the procedure builds to see how the variable is dummy coded.

2. Once you've made the changes above, you can pick which level is the reference level. You do not need to dummy code it yourself.

Most of these options are better explained in the documentation for the LOGISTIC procedure, so finding a good example there may be an easier starting point.
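
A minimal sketch of what the CLASS changes above might look like, not a definitive fix. It assumes that cons_sex_age and sexual_orient are the categorical predictors (the thread only confirms cons_sex_age), that both are numeric with no format attached, and that level 0 is the intended reference for cons_sex_age. The SHOW option asks GLMSELECT to print the coding it uses for each CLASS variable, which is one way to check the dummy coding.

```sas
/* Sketch only: which predictors are categorical is an assumption. */
proc glmselect data=baseline;
   /* A numeric predictor that is not listed in CLASS is treated as   */
   /* continuous, so each categorical predictor in MODEL goes here.   */
   class cons_sex_age(ref='0') sexual_orient(ref=first)
         / param=ref show;
   model PSS_score = sexual_orient ever_preg cons_sex_age partners_6M CES_D_dep
         / selection=stepwise select=SL sle=0.05 showpvalues;
   title "Stepwise Regression SRH for OVERALL, 0.05";
run;
```

Under reference coding, the estimate reported for a non-reference level of cons_sex_age is the adjusted difference in mean PSS_score between that level and level 0; the reference level itself gets no estimate. gender is left out of CLASS here because it does not appear in the MODEL statement.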



2 REPLIES
tpakhomova
Fluorite | Level 6
Thank you very much, Reeza. Quick follow-up question:

So if I have the following:

proc GLMSELECT data=baseline;
class gender /PARAM=REF REF=first;
model PSS_score = sexual_orient ever_preg cons_sex_age partners_6M CES_D_dep
/ selection=stepwise select=SL showpvalues SLE=0.05 ;
title "Stepwise Regression SRH for OVERALL, 0.05";
run;

Would REF=first mean that the level coded 0 would be the reference (if my levels are 0-2)? Or would it make sense for my levels to be coded 1-3?
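
One way to check this is to run a tiny example and read the coding table. The sketch below uses made-up toy data (hypothetical, not from the study). As far as I know, REF=FIRST picks the first level in the CLASS sort order, and with the default ORDER=FORMATTED an unformatted numeric variable sorts by its raw values, so level 0 should come out as the reference without recoding to 1-3.

```sas
/* Hypothetical toy data, only to inspect how the levels are coded */
data toy;
   input cons_sex_age PSS_score @@;
   datalines;
0 12  0 15  1 20  1 22  2 17  2 18
;
run;

proc glmselect data=toy;
   /* SHOW prints the coding table, so the reference level can be read off */
   class cons_sex_age / param=ref ref=first show;
   model PSS_score = cons_sex_age / selection=none;
run;
```

If a different baseline is wanted, a per-variable option such as cons_sex_age(ref='2') names the level directly instead of relying on sort order.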
