Using continuous variables in proc logistic

09-24-2012 08:36 AM


Hi,

I've built logistic regression models in SPSS previously. In SPSS I had access to CHAID, so I could split my continuous variables into a number of categorical variables using a chi-squared test that calculates the optimum splits based on the response variable.

Unfortunately, I don't have SAS Enterprise Miner, so I can't get access to CHAID here. I was wondering how you guys would handle a continuous variable if you needed to split it into a categorical variable, and how you would split it using an alternative procedure?

Thanks for your help.

Mo
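For readers looking for a non-CHAID way to bin a continuous predictor, here is a minimal sketch using PROC RANK. Note that this is plain quantile binning, not CHAID: the cut points ignore the response variable entirely. The data set (sashelp.heart, which ships with SAS), the output names work.heart_binned and Chol_Quartile, and the choice of four groups are all assumptions made for illustration and are not from the original post.

```sas
/* Assumption: sashelp.heart stands in for the poster's data.          */
/* GROUPS=4 assigns each observation to a quartile numbered 0 to 3.    */
proc rank data=sashelp.heart groups=4 out=work.heart_binned;
   var Cholesterol;          /* continuous variable to be binned       */
   ranks Chol_Quartile;      /* hypothetical name for the bin variable */
run;

/* Use the binned variable as a CLASS effect in the logistic model.    */
proc logistic data=work.heart_binned;
   class Chol_Quartile (ref='0') / param=ref;
   model Status(event='Dead') = Chol_Quartile AgeAtStart;
run;
```

Observations with a missing Cholesterol value keep a missing Chol_Quartile and are dropped from the model fit.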

Accepted Solutions

Solution


09-24-2012 08:53 AM

Why discretize your data? In a short course at the FDA/ASA Biopharmaceutical section in 2011, Dr. Stephen Senn of the University of Glasgow pointed out that discretization is one of the most common errors requested from statisticians by clients (in this case he was referring to clinicians). The loss of power and interpretability were the primary considerations. I know this doesn't answer your question, but rather raises a philosophical point. PROC LOGISTIC can certainly handle continuous independent variables, and if the continuous variable you wish to discretize is the dependent variable, there is a wealth of regression procedures available. For instance, PROC QUANTREG can fit regression models for various quantiles of the response variable.

Steve Denham
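As a rough illustration of keeping a predictor continuous in PROC LOGISTIC, the sketch below fits one model with linear terms and one with a spline term that allows a nonlinear effect without binning. The data set (sashelp.heart, which ships with SAS), the unit sizes, the effect name chol_spl, and the knot settings are assumptions chosen for the example, not anything recommended in the post above.

```sas
/* Assumption: sashelp.heart stands in for real data.                  */
/* Variables not listed in a CLASS statement are treated as continuous.*/
proc logistic data=sashelp.heart;
   model Status(event='Dead') = AgeAtStart Cholesterol;
   /* Report odds ratios per 10-unit and 50-unit change, respectively. */
   units AgeAtStart=10 Cholesterol=50;
run;

/* Optional: a natural cubic spline for Cholesterol, with 5 knots      */
/* placed at percentiles, instead of cutting it into categories.       */
proc logistic data=sashelp.heart;
   effect chol_spl = spline(Cholesterol / naturalcubic basis=tpf(noint)
                                          knotmethod=percentiles(5));
   model Status(event='Dead') = AgeAtStart chol_spl;
run;
```

The EFFECT statement needs a reasonably recent SAS/STAT release; on older installations only the first step will run.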

All Replies



09-27-2012 11:41 AM

Hi Steve,

Thanks for pointing that out, that's great news!

Can I just explain what I am trying to do, so maybe you can help me find the best way to do it in SAS as a newbie!

I am trying to predict a customer's behaviour as a time-dependent variable. So the response variable is that a customer is one of:

0: Active

1: 1 month before Inactive

2: 2 months before Inactive

3: 3 months before Inactive

4: 4-6 months before Inactive

I am using the following statement:

(Max segment is by definition a categorical variable.)

proc logistic data = churn.ModelDataSet_Training;
   /* Max_Segment is categorical; predictors not listed here are treated as continuous */
   class Max_Segment;
   /* Generalized logit model with response level '0' (Active) as the reference */
   model response (ref = '0') =
         Max_Segment total_turnover Total_PNL Total_Trades
         Avg_DaysDiff_NextTrade ACCOUNT_VALUE_GBP EOD_CASH_BALANCE_GBP
         TOTAL_PAYMENTS_GBP TOTAL_PAYMENTS_IN_GBP TOTAL_PAYMENTS_OUT_GBP
         Nr_Logins Nr_Devices avg_sessiontime Win_ratio
         DaysBetweenPaymentsIn DaysBetweenPaymentsOut Max_Consecutive_loss_days
         Turnover_ratio total_pnl_ratio total_trades_ratio
         Avg_DDiff_NextTrade_ratio Account_value_ratio Eod_Cash_Balance_ratio
         Total_Payments_ratio Total_Payments_In_ratio Total_Payments_Out_ratio
         Nr_Logins_ratio Win_pct_ratio Nr_Devices_Ratio avg_sessiontime_ratio
         DBetweenPayInRatio DaysBetweenPaymentsOut_ratio max_consec_lossd_ratio
         / link = glogit selection = forward itprint
           maxstep = 100 maxiter = 100 rsquare;
run;

There may be a better way to fit the predictors to the response, such as having a continuous response variable as you suggest, though I do need to split Active and Inactive separately.

I look forward to hearing your expertise!

Thanks

Mo