chemicalab
Fluorite | Level 6

Hi all,

My question, as in the topic header, is: which form of variable is best to use in logistic regression, and why?

Should the variables be continuous, or should I focus on binning and categorical variables?

What are the pros and cons for each situation?

Thanks in advance

3 REPLIES
jwexler
SAS Employee

Hi Chemicalab. It depends on the purpose of your model. If this is for a credit scorecard, bins are often used because they make it easier to explain why an applicant was rejected. Binning also lets you keep working with a model that is linear in its terms even when the relationship with your continuous variable is non-linear, because each bin gets its own coefficient. The trade-off is that you may lose the ability to differentiate values within a bin (is a 21-year-old different from a 24-year-old?), so you need to test your assumptions. I advise you to try both ways! Thanks, Jonathan
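To make the "try both ways" advice concrete, here is a minimal sketch in plain Python (not SAS) that fits the same logistic model twice: once with age as a single continuous term and once with age binned into dummy variables. Everything in it is an assumption made for illustration: the data are synthetic and deliberately U-shaped (80% event rate under 30 and from 60 up, 20% in between), the bin cut-offs are arbitrary, and the fit is a simple gradient descent rather than what a SAS procedure does internally.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def predict(w, row):
    # Predicted event probability for one design-matrix row
    return sigmoid(sum(wi * xi for wi, xi in zip(w, row)))

def fit_logistic(rows, labels, lr=0.5, steps=2000):
    # Batch gradient descent on the mean log-loss; each row already holds the intercept column
    w = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def log_loss(w, rows, labels):
    total = 0.0
    for x, y in zip(rows, labels):
        p = predict(w, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(rows)

# Synthetic data (assumption): 5 people per age, U-shaped event rate
ages, labels = [], []
for age in range(20, 70):
    events = 1 if 30 <= age < 60 else 4  # 1 of 5 in mid-range, 4 of 5 at the extremes
    for k in range(5):
        ages.append(age)
        labels.append(1 if k < events else 0)

mean = sum(ages) / len(ages)
sd = math.sqrt(sum((a - mean) ** 2 for a in ages) / len(ages))

def continuous_row(age):
    # Intercept plus standardized age
    return [1.0, (age - mean) / sd]

def binned_row(age):
    # Intercept plus dummies for two bins; ages under 30 are the reference bin
    return [1.0, 1.0 if 30 <= age < 60 else 0.0, 1.0 if age >= 60 else 0.0]

cont_rows = [continuous_row(a) for a in ages]
bin_rows = [binned_row(a) for a in ages]

w_cont = fit_logistic(cont_rows, labels)
w_bin = fit_logistic(bin_rows, labels)

continuous_loss = log_loss(w_cont, cont_rows, labels)
binned_loss = log_loss(w_bin, bin_rows, labels)

print("log-loss, continuous age:", round(continuous_loss, 4))
print("log-loss, binned age:    ", round(binned_loss, 4))
```

On this constructed data the binned fit has the lower log-loss, because a single slope cannot follow a U shape. The cost is the one mentioned above: `binned_row(21)` and `binned_row(24)` are identical, so the binned model gives both ages the same score. With a relationship that really is close to linear in the log-odds, the comparison can easily go the other way, which is why it is worth running both on your own data.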

chemicalab
Fluorite | Level 6

Sounds like what I had in mind. Thank you for the clarification, Jonathan.

fri0
Quartz | Level 8

Categorical variables can help because they may contain important information for your model, but regression is preferably a technique for continuous variables. If your project needs a key categorical variable, you can include it, but keep the number of categorical variables to a minimum, since every level beyond the reference adds another parameter to estimate. If all of your data is categorical, you could consider other techniques designed specifically for this kind of variable, such as ordinal regression.
