Programming the statistical procedures from SAS

Is it a sensible practice to collapse categories of a predictor?

Contributor
Posts: 52

I fitted a logistic regression model. An explanatory variable called 'location' has 3 levels (1, 2, 3), with level 3 as the reference group. In this analysis, the estimated regression coefficients for level 1 and level 2 are 1.102 and 1.111 respectively. As the values are very close, is it sensible to combine these two levels into a single level to make the model simpler? Or is it better to keep the two levels separate?

Super User
Posts: 18,603

Depends on the context of the data.

If, for example, level 1 is age under 30, level 2 is age 30 to 60, and level 3 is age over 60, then combining levels 1 and 2 simply recodes age to 60 and under versus over 60, which is okay as long as you don't introduce a bias into your data.

If you're combining things that don't belong together, e.g. level 1 is unknown and level 2 is Grade 1, then it doesn't make sense.

Basically, check whether it makes logical sense to collapse them from a business or interpretive perspective, and check whether the distribution of the outcome differs significantly between the levels you want to combine (a chi-square test usually works) to make sure you interpret things correctly.
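As a rough illustration of that chi-square check, here is a minimal sketch in Python (standard library only) that compares the outcome distribution between the two levels being considered for merging. The function name `chi_square_2x2` and the counts are hypothetical and made up purely for illustration; nothing here comes from the poster's data. It assumes a binary outcome, so the table is 2x2 with 1 degree of freedom, and it applies no continuity correction.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table[i][j] is the number of observations in location level i (rows)
    with outcome j (columns: event, non-event).
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    # Shortcut formula for the 2x2 Pearson statistic.
    stat = n * (a * d - b * c) ** 2 / (
        row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1]
    )
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts (illustration only): rows are location levels 1 and 2,
# columns are event / non-event.
counts = [[30, 70], [32, 68]]
stat, p_value = chi_square_2x2(counts)
print(f"chi-square = {stat:.4f}, p-value = {p_value:.4f}")
```

A large p-value means the data show no evidence that the two levels differ in outcome rate, which supports merging only if the combined category also makes sense in context. In SAS itself, the same test is available through the CHISQ option on the TABLES statement of PROC FREQ.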

I work in Health Care and we routinely do this.
