04-12-2018 01:25 PM
I am working on a logistic regression using 'proc logistic'. One of the input variables is 'age' (ranging from 18 to 78). I have created a new variable, 'age_levels', by grouping 'age' into ranges with a clustering procedure; it has 4 levels.
Using proc freq, I can see that there is a dependency between 'age_levels' and the target variable of the model.
My question is which variable I should use as the input in proc logistic: 'age' or 'age_levels'.
Can anybody help?
04-12-2018 01:57 PM
You can use any variable(s) that you think are appropriate in such a model.
I usually avoid grouping a continuous input variable such as age into four levels. To me, this throws away potentially useful information contained in age.
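If it helps to compare the two options directly, here is a minimal sketch that fits each one separately. The dataset name work.mydata and the target name 'target' (with the event coded as 1) are assumptions made for illustration, not names from the original post; 'age' and 'age_levels' are the variables described in the question.

```sas
/* Hypothetical names: work.mydata and target are placeholders. */

/* Model A: age entered as a continuous predictor */
proc logistic data=work.mydata;
   model target(event='1') = age;
run;

/* Model B: the four-level grouping entered as a categorical predictor */
proc logistic data=work.mydata;
   class age_levels / param=ref;
   model target(event='1') = age_levels;
run;
```

Each run prints a "Model Fit Statistics" table (AIC, SC, -2 Log L) and an "Association of Predicted Probabilities and Observed Responses" table (including the c statistic), so the two specifications can be compared side by side on the same data.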
04-12-2018 03:50 PM