am12scorp
Calcite | Level 5

I fit a GLM with a log link and a gamma distribution to model the impact of appropriate cancer care on costs. Covariates include age, race/ethnicity, location, tumor stage, and tumor grade, to name a few.

With 'no appropriate care' and 'stage 1' as the reference categories for the appropriate care and tumor stage variables, respectively, I get beta estimates of 9.0663, 0.6953, and 0.6669 for the intercept, 'non-appropriate care', and 'stage 2 tumor', respectively.

When I change the reference category for tumor stage to 'stage 2', I get beta estimates of 8.7319, 0.6953, and -0.3288 for the intercept, 'non-appropriate care', and 'stage 1 tumor', respectively.
Even though the beta estimates for the key independent variable and the other covariates remain the same, the beta estimate for the intercept changes every time I change the reference level of certain variables. Why does this happen? Would this not change the finding for the key independent variable every time I change the reference group for any covariate?

I would appreciate your help with this, along with a pointer to an appropriate reference.

Thank you in advance.

3 REPLIES
SteveDenham
Jade | Level 19

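This is most likely due to the non-full-rank (GLM) parameterization being used. Under that parameterization the estimate for the reference category is set to zero, so the intercept is the estimate (on the log scale, given the log link) for an observation at the reference level of every CLASS variable. Changing the reference category therefore changes what the intercept represents, and so its value. This applies only to categorical variables listed in the CLASS statement.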
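
As a rough sketch of the algebra, assume for illustration that tumor stage has only two levels. The symbols here are not from the thread: $x_c$ is the indicator for the care variable, $x_1$ and $x_2$ are indicators for stage 1 and stage 2, and the remaining covariates are collected in "other terms". With stage 1 as the reference, the model is the first line. Substituting $x_2 = 1 - x_1$ gives the same model written with stage 2 as the reference:

```latex
\begin{aligned}
\log \mu &= \beta_0 + \beta_c\,x_c + \beta_2\,x_2 + \text{(other terms)} \\
         &= (\beta_0 + \beta_2) + \beta_c\,x_c - \beta_2\,x_1 + \text{(other terms)}
\end{aligned}
```

Under this assumption the intercept absorbs $\beta_2$, the stage coefficient changes sign, and $\beta_c$ is untouched. When a variable has more than two levels, the coefficients of its other non-reference levels shift by the same amount as well, while the coefficients of the other variables still do not change.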

SteveDenham

am12scorp
Calcite | Level 5

Thank you for your response, SteveDenham!

I am not a statistics person. Is there a simple explanation for this? All the variables in the model are categorical, and I have listed all of them in the CLASS statement.

Rick_SAS
SAS Super FREQ

The predicted values are the same, but the estimate for the intercept depends on the reference category. So when you change the reference level, you will see the intercept estimate change in a predictable way.
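
A minimal sketch of this point, in Python rather than SAS and with made-up cost data (none of these numbers come from the thread). It assumes a model with a single categorical predictor. In that case the log-link gamma fit reproduces each level's sample mean, so the coefficients can be written in closed form without a fitting routine. The function names `fit_one_factor_log_link` and `predict` are hypothetical helpers for this illustration only.

```python
import math

# Hypothetical cost data (not from the thread), grouped by tumor stage.
costs = {
    "stage 1": [8000.0, 9500.0, 8700.0],
    "stage 2": [16000.0, 18500.0, 17100.0],
}

def fit_one_factor_log_link(groups, reference):
    """Closed-form coefficients for a log-link model with one categorical
    predictor and reference-level coding: the fitted mean of each level is
    its sample mean, the intercept is the log mean of the reference level,
    and each effect is a difference of log means (zero for the reference)."""
    log_means = {level: math.log(sum(y) / len(y)) for level, y in groups.items()}
    intercept = log_means[reference]
    effects = {level: lm - intercept for level, lm in log_means.items()}
    return intercept, effects

def predict(intercept, effects, level):
    """Predicted mean cost on the original scale (inverse of the log link)."""
    return math.exp(intercept + effects[level])

# Same data, two choices of reference level.
b0_a, eff_a = fit_one_factor_log_link(costs, reference="stage 1")
b0_b, eff_b = fit_one_factor_log_link(costs, reference="stage 2")

print("intercepts:", round(b0_a, 4), round(b0_b, 4))
for level in costs:
    # The two parameterizations give the same prediction for every level.
    print(level, round(predict(b0_a, eff_a, level), 2),
          round(predict(b0_b, eff_b, level), 2))
```

The two intercepts differ by exactly the stage effect, and the stage effect flips sign, but the predicted cost for each level is identical under both codings. The same logic carries over to models with several CLASS variables, where the estimates for the other variables are unaffected.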

It might not be "simple," but you can read more about different CLASS parameterizations in this blog post: "Coding and simulating categorical variables in regression models"
