Binning a set of Continuous Variables using Percen...

06-12-2016 09:22 AM

I have a SAS Programming Problem that you may have already solved:

My data set contains three sets of continuous variables: DQ01-DQ59, DE01-DE59 and DL01-DL59 (177 variables in total), each standardised to mean = 50 and variance = 100.

The basic Statistical problem is Binary Logistic Regression.

1. I want to bin each continuous variable using deciles or semi-deciles that have been computed using PROC UNIVARIATE / PROC SUMMARY.

2. Compute and output the Percentiles for each Variable.

3. For each variable, compare the observed values with the percentile cut-points and then allocate each observation to a decile bin.

4. Optimise the bin allocation based on a metric such as the Gini coefficient.

5. Apply a robust WOE (weight of evidence) transformation to each binned variable, subject to the following constraints:

   a. The % frequency within each bin is > 5%.

   b. The WOE transformation is monotonic.

6. Fit a Binary Logistic Regression Model to the WOE-Transformed Variables.
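To make steps 2, 3 and 5 concrete, here is a minimal sketch in Python (standard library only), not a SAS solution. The function names `decile_cutpoints`, `assign_bin`, `woe_table` and `satisfies_constraints` are hypothetical. It assumes the target is coded 0/1, takes WOE as ln(share of non-events / share of events), which is one common convention (some texts use the opposite sign), and uses a 0.5 count adjustment as one possible reading of "robust" so that empty cells do not produce log(0). It does not attempt the optimisation in step 4.

```python
import math
from bisect import bisect_left
from statistics import quantiles


def decile_cutpoints(values, n_bins=10):
    """Return the n_bins - 1 interior percentile cut-points (step 2)."""
    return quantiles(values, n=n_bins, method="inclusive")


def assign_bin(x, cuts):
    """Return the bin index 0..len(cuts) for x (step 3).

    Bin k holds values with cuts[k-1] < x <= cuts[k].
    """
    return bisect_left(cuts, x)


def woe_table(values, target, cuts, smooth=0.5):
    """Return one row per bin with its frequency share and WOE (step 5).

    Assumption: target is 1 for an event and 0 for a non-event.
    `smooth` is added to every cell count to avoid log(0).
    """
    n_bins = len(cuts) + 1
    events = [0] * n_bins
    nonevents = [0] * n_bins
    for x, y in zip(values, target):
        b = assign_bin(x, cuts)
        if y == 1:
            events[b] += 1
        else:
            nonevents[b] += 1
    total_events = sum(events) + smooth * n_bins
    total_nonevents = sum(nonevents) + smooth * n_bins
    rows = []
    for b in range(n_bins):
        share = (events[b] + nonevents[b]) / len(values)
        p_nonevent = (nonevents[b] + smooth) / total_nonevents
        p_event = (events[b] + smooth) / total_events
        rows.append({"bin": b, "share": share,
                     "woe": math.log(p_nonevent / p_event)})
    return rows


def satisfies_constraints(rows, min_share=0.05):
    """Check constraints (a) share > min_share and (b) monotonic WOE."""
    woes = [r["woe"] for r in rows]
    steps = [later - earlier for earlier, later in zip(woes, woes[1:])]
    monotonic = all(s >= 0 for s in steps) or all(s <= 0 for s in steps)
    return monotonic and all(r["share"] > min_share for r in rows)
```

When `satisfies_constraints` returns False, one option is to merge adjacent bins and recompute the table; the sketch leaves that choice open.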

If you have any advice or suggestions w.r.t. the above please let me know.

Regards


Posted in reply to JonDickens1607

07-07-2016 08:26 PM

I think this is a little too big for a forum post. Also, you posted it twice, in two different forums.


Posted in reply to JBerry

07-08-2016 06:31 AM

My main problem was how to process a large number of variables using the same binning algorithm.

I have constructed a solution for the binning process for a single variable using PROC RANK.

Now I need a macro, possibly using arrays, that lets me repeat the process for every variable and combine the output into one table.

Has this reduced the problem sufficiently?
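The "repeat for every variable and stack the results" pattern can be sketched as follows. This is a minimal Python illustration of the loop structure only, not SAS code; `variable_names`, `bin_summary` and `bin_all` are hypothetical names, and the data is assumed to be a dict mapping each variable name to a list of numbers. In SAS itself, the VAR and RANKS statements of PROC RANK accept variable lists such as DQ01-DQ59, so the ranking step may not need a macro loop at all.

```python
from bisect import bisect_left
from statistics import quantiles


def variable_names(prefixes=("DQ", "DE", "DL"), count=59):
    """Build DQ01..DQ59, DE01..DE59, DL01..DL59 (177 names)."""
    return [f"{p}{i:02d}" for p in prefixes for i in range(1, count + 1)]


def bin_summary(name, values, n_bins=10):
    """Bin one variable at its percentile cut-points; one row per bin."""
    cuts = quantiles(values, n=n_bins, method="inclusive")
    counts = [0] * n_bins
    for x in values:
        counts[bisect_left(cuts, x)] += 1
    lowers = [min(values)] + cuts   # lower edge of each bin
    uppers = cuts + [max(values)]   # upper edge of each bin
    return [
        {"variable": name, "bin": b, "lower": lowers[b],
         "upper": uppers[b], "count": counts[b]}
        for b in range(n_bins)
    ]


def bin_all(data, names, n_bins=10):
    """Apply the same binning to every variable and stack the rows."""
    table = []
    for name in names:
        table.extend(bin_summary(name, data[name], n_bins))
    return table
```

Calling `bin_all(data, variable_names())` returns a single long table with one row per variable and bin, which is the shape a later WOE step would read from.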