Binning a large number of Continuous Variables using Percentiles or other Cut-Points

I have a SAS Programming Problem that you may have already solved:
 
My Data Set contains three sets of continuous variables:
 
DQ01 - DQ59, DE01 - DE59 and DL01 - DL59
 
(177 variables in total), each standardised to Mean = 50 and Variance = 100.
 
The basic Statistical problem is Binary Logistic Regression.
 
1. I want to bin each continuous variable using deciles or semi-deciles 
    that have been computed using PROC Univariate / Summary.
 
2. Compute and output the Percentiles for each Variable.
 
3. For each variable, compare the observed values with the Percentile 
    Cut-Points and allocate each observation to a Decile Bin.
 
4. Optimise the Bin Allocation based on a metric such as the Gini coefficient.
 
5. Apply a Robust WOE (Weight of Evidence) Transformation to each Binned Variable,
   subject to the following constraints:
   a. The % frequency within each bin is > 5%
   b. The WOE transformation is Monotonic across the bins
 
6. Fit a Binary Logistic Regression Model to the WOE-Transformed Variables.
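 
To make the intent of steps 1 to 3 and 5 concrete, here is a minimal sketch of the logic for a single variable. It is written in Python rather than SAS purely as an illustration. The function names, the toy data, the 0.5 smoothing constant and the WOE sign convention (share of non-events over share of events) are all assumptions for the example, and the Gini-based optimisation in step 4 is not covered.

```python
import bisect
import math
import statistics


def decile_cut_points(values):
    """Return the 9 cut points that split `values` into 10 equal-frequency bins."""
    return statistics.quantiles(values, n=10, method="inclusive")


def assign_bin(value, cut_points):
    """Return a bin index from 0 to len(cut_points).

    A value exactly equal to a cut point is placed in the lower bin.
    """
    return bisect.bisect_left(cut_points, value)


def woe_by_bin(bins, target, n_bins, smoothing=0.5):
    """WOE per bin, here taken as ln(share of non-events / share of events).

    `smoothing` is added to every cell count so an empty cell cannot give
    log(0). Both the constant and the sign convention are assumptions.
    """
    events = [smoothing] * n_bins
    non_events = [smoothing] * n_bins
    for b, y in zip(bins, target):
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1
    total_events = sum(events)
    total_non_events = sum(non_events)
    return [
        math.log((non_events[b] / total_non_events) / (events[b] / total_events))
        for b in range(n_bins)
    ]


def min_bin_share(bins, n_bins):
    """Smallest fraction of observations falling in any one bin (constraint a)."""
    counts = [0] * n_bins
    for b in bins:
        counts[b] += 1
    return min(counts) / len(bins)


def is_monotonic(seq):
    """True if seq is non-decreasing or non-increasing throughout (constraint b)."""
    pairs = list(zip(seq, seq[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Toy data (hypothetical): one variable taking the values 1..100, with an
# event rate that rises from 0 in 10 in the lowest decile to 9 in 10 in the highest.
values = [float(v) for v in range(1, 101)]
target = [1 if (i % 10) < (i // 10) else 0 for i in range(100)]

cuts = decile_cut_points(values)              # step 2: percentile cut points
bins = [assign_bin(v, cuts) for v in values]  # step 3: allocate to decile bins
woe = woe_by_bin(bins, target, n_bins=10)     # step 5: WOE per bin

print("cut points:", cuts)
print("constraint a (every bin > 5%):", min_bin_share(bins, 10) > 0.05)
print("constraint b (monotonic WOE):", is_monotonic(woe))
```

In practice the same three calls would be repeated for each of the 177 variables, and any bins that fail either constraint would be merged with a neighbour before the WOE values are recomputed.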
 
If you have any advice or suggestions w.r.t. the above please let me know.
 
Regards