Input Variables in Logistic Regression

Frequent Contributor
Posts: 137

Hi,

 

I am working on a logistic regression using 'proc logistic'. One of the input variables is 'age' (ranging from 18 to 78). I have created a new variable, 'age_levels', by grouping 'age' into ranges with a clustering procedure; it has 4 levels.

Using proc freq, I can see that there is a dependency between 'age_levels' and the target variable of the model.

My question is which variable I should use as the input in the proc logistic model: 'age' or 'age_levels'?

Can anybody help me?

 

Thanks

Respected Advisor
Posts: 3,055

Re: Input Variables in Logistic Regression

Posted in reply to juanvg1972

You can use any variable(s) that you think are appropriate in such a model.

 

I usually avoid clustering of input variables such as age into four levels. This seems to me to throw away potentially useful information that is contained in age.

--
Paige Miller
Super User
Posts: 23,771

Re: Input Variables in Logistic Regression

Posted in reply to juanvg1972
When you use age as a categorical variable, you're saying that the category boundaries reflect some real grouping. So if it makes sense that a 41-year-old has different outcomes than a 39-year-old (assuming one age cutoff is 40), then use the categories. Otherwise, the general consensus is not to group a continuous variable unless there is a specific reason to do so.
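
One way to check this on your own data is to fit both specifications and compare them. The following is only a minimal sketch: the data set name `work.mydata` and the target name `target` (coded 0/1) are assumptions made for illustration, since the original post does not give them. Only `age` and `age_levels` come from the question.

```sas
/* Hypothetical names: work.mydata and target (0/1) are placeholders, */
/* not taken from the original post.                                  */

/* Model 1: age entered as a continuous predictor (one slope). */
proc logistic data=work.mydata;
   model target(event='1') = age;
run;

/* Model 2: age_levels entered as a categorical predictor.      */
/* The CLASS statement creates design variables for the levels; */
/* param=ref uses reference-cell coding.                        */
proc logistic data=work.mydata;
   class age_levels / param=ref;
   model target(event='1') = age_levels;
run;
```

Both runs print a "Model Fit Statistics" table (AIC, SC, -2 Log L) and an association table that includes the c statistic. If the two models are fitted to the same observations, comparing AIC gives a rough indication of whether the grouped version loses information relative to the continuous one.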