juanvg1972
Pyrite | Level 9

Hi,

 

I am working on a logistic regression using 'proc logistic'. One of the input variables is 'age' (ranging from 18 to 78). I have created a new variable, 'age_levels', by clustering 'age' into ranges; it has 4 levels.

 

Using proc freq, I can see that there is a dependency between 'age_levels' and the target variable of the model.

My question is which variable I should use as the input in proc logistic: 'age' or 'age_levels'.

 

Can anybody help me?

 

Thanks

2 REPLIES
PaigeMiller
Diamond | Level 26

You can use any variable(s) that you think are appropriate in such a model.

 

I usually avoid clustering of input variables such as age into four levels. This seems to me to throw away potentially useful information that is contained in age.
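
One way to see how much information the grouping gives up is to fit both versions and compare them. The following is only a sketch: the dataset name work.mydata and the target name target (with the event coded as 1) are hypothetical placeholders, since the original post does not give them.

```sas
/* Hypothetical names: dataset WORK.MYDATA, binary target TARGET (event = 1). */

/* Model A: age as a continuous predictor (a single slope parameter). */
proc logistic data=work.mydata;
   model target(event='1') = age;
run;

/* Model B: the clustered variable as a categorical predictor        */
/* (4 levels, so 3 parameters with reference coding).                */
proc logistic data=work.mydata;
   class age_levels / param=ref;
   model target(event='1') = age_levels;
run;
```

Comparing the AIC and SC values in the "Model Fit Statistics" table and the c statistic in the "Association of Predicted Probabilities and Observed Responses" table of the two runs gives a rough indication of which form of age describes the target better in this particular data set.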

--
Paige Miller
Reeza
Super User
When you use age as a categorical variable, you're saying that the boundaries reflect some real grouping. So if it makes sense that a 41-year-old has different outcomes than a 39-year-old (assuming one age cutoff is 40), then use the categories. Otherwise, the general consensus is not to group the variable unless there's some reason to do so.
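
If it is unclear whether the effect of age is simply not linear, rather than truly stepped at the cut points, one possible check is to let the model fit a smooth curve in age and plot it. This is a minimal sketch and not a definitive recipe: work.mydata and target are hypothetical names, and the spline options (a natural cubic spline with 5 knots at percentiles) are arbitrary choices made for illustration.

```sas
/* Hypothetical names: dataset WORK.MYDATA, binary target TARGET (event = 1). */
ods graphics on;

proc logistic data=work.mydata;
   /* Flexible, smooth function of age; knot count and placement are assumptions. */
   effect age_spl = spline(age / naturalcubic basis=tpf(noint)
                                 knotmethod=percentiles(5));
   model target(event='1') = age_spl;
   /* Plot the predicted probability against age. */
   effectplot fit(x=age);
run;
```

If the plotted curve is close to monotone and smooth, keeping age continuous is probably reasonable. A curve with clearly different regimes might be an argument for grouping, ideally with cut points that also make sense for the subject matter.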
