<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Binning (categorize continuous var into categories) in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/937532#M368378</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp; et al!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;&amp;nbsp; suggested, you can use PROC OPTMODEL to create monotonic bins that meet your criteria and maximize IV (also known as symmetric Kullback-Leibler divergence).&amp;nbsp; I posted some code to do that five years ago:&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/Mathematical-Optimization/Trying-to-use-PROC-OPTMODEL-for-monotonic-supervised-optimal/m-p/553822" target="_blank"&gt;Trying to use PROC OPTMODEL for monotonic supervised optimal binning o... - SAS Support Communities&lt;/A&gt;&lt;/P&gt;&lt;P&gt;It uses integer programming, which is not very efficient: it can take a very long time to run, even on small problems.&amp;nbsp; That is the nature of combinatorial optimization.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have attached some code that uses a slightly less optimal two-stage approach.&amp;nbsp; The first stage is an isotonic regression, which creates the closest-fitting monotonic sequence from the original data; you can think of it as very granular binning in which you have no control over the number or sizes of the bins.&amp;nbsp; The second stage uses a dynamic programming approach to create bins with the properties you require (number and sizes of bins; you would have to add the constraint on the minimum number of events in each bin yourself).&amp;nbsp; The dynamic programming doesn't run into the same computational issues as the integer programming approach, but it can't impose monotonicity; that's why you precede it with the isotonic regression, which forces the subsequent bins to remain monotonic.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The five files run in the following order:&lt;/P&gt;&lt;P&gt;1. 
_0_import_german_credit.csv_.txt reads in the data&lt;/P&gt;&lt;P&gt;2.&amp;nbsp;_1_agg_gc_credit_amount_.txt aggregates to a single observation for each unique predictor variable value&lt;/P&gt;&lt;P&gt;3.&amp;nbsp;_3_agg_ds_for_optmodel_.txt adds some cumulative variables to the aggregated data&lt;/P&gt;&lt;P&gt;4.&amp;nbsp;_5_isoreg_decrease_data_step_.txt runs the isotonic regression on the aggregated data&lt;/P&gt;&lt;P&gt;5.&amp;nbsp;_100_gcdp011_post_isoreg_4+-gps_40-minsz_ivx_.txt runs the dynamic programming algorithm on the output of the isotonic regression.&amp;nbsp; In this case, it seeks at least four bins with a minimum size of forty records, and maximum IV.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I hope you find this helpful.&lt;/P&gt;</description>
    <pubDate>Mon, 29 Jul 2024 20:31:15 GMT</pubDate>
    <dc:creator>Top_Katz</dc:creator>
    <dc:date>2024-07-29T20:31:15Z</dc:date>
    <item>
      <title>Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936719#M368148</link>
      <description>&lt;P&gt;Hello.&lt;/P&gt;
&lt;P&gt;Let's say I have a data set with many continuous independent variables and a binary response variable (default: 1 = yes, 0 = no).&lt;/P&gt;
&lt;P&gt;For a specific variable X, I want to bin its values into categories.&lt;/P&gt;
&lt;P&gt;I have some criteria:&lt;/P&gt;
&lt;P&gt;1-Classify into 4 or 5 categories&lt;/P&gt;
&lt;P&gt;2-Category 1 is the worst category (the category with the highest bad rate)&lt;/P&gt;
&lt;P&gt;3-Each category must contain at least 50 bads (people with Y=1)&lt;/P&gt;
&lt;P&gt;4-The failure rate must be ordered (monotonic) across the categories&lt;/P&gt;
&lt;P&gt;5-The bad rate must differ by at least 1% from one category to the next&lt;/P&gt;
&lt;P&gt;6-If several binnings satisfy criteria 1-5, choose the one with the highest information value (IV).&lt;/P&gt;
&lt;P&gt;Can anyone show code that performs this?&lt;/P&gt;
&lt;P&gt;What approach would you use to do the binning under the criteria I mentioned?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is SAS code that shows whether the conditions I mentioned are met.&lt;/P&gt;
&lt;P&gt;It also reports the IV and the default rate in each category.&lt;/P&gt;
&lt;P&gt;As I said, the key question is how to choose the bins.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
PROC FORMAT ;
VALUE IV_fmt
low-&amp;lt;0.02='Very weak'
0.02 - 0.1 ='Weak'
0.1 &amp;lt;-0.3  ='Intermediate'
0.3 &amp;lt;- HIGH ='Strong'
;
RUN;

%macro inf_value(Raw_Data_tbl,var,Response_VAR);
PROC SUMMARY DATA=&amp;amp;Raw_Data_tbl. (keep= &amp;amp;var. &amp;amp;Response_VAR.);
VAR  &amp;amp;Response_VAR.;
OUTPUT OUT=inf0 (DROP=_TYPE_)sum=all_bad;
RUN;

data _null_;
set inf0;
good=_freq_-all_bad;
call symputx('all_good',good);
call symputx('all_bad',all_bad);
run;
%put &amp;amp;all_good;
%put &amp;amp;all_bad;

PROC SUMMARY DATA=&amp;amp;Raw_Data_tbl. (keep= &amp;amp;var. &amp;amp;Response_VAR.)nway;
class &amp;amp;var.;
VAR  &amp;amp;Response_VAR.;
OUTPUT OUT=inf1 (DROP=_TYPE_ rename=(_freq_=Nr_Customers))sum=Nr_bad;
RUN;

data inf2;
set inf1;
Nr_good=Nr_Customers-Nr_bad;
if &amp;amp;all_bad. not in (0,.) then PCT_Bads=Nr_bad/&amp;amp;all_bad.;
if &amp;amp;all_good. not in (0,.) then PCT_Goods=Nr_good/&amp;amp;all_good.;
res_helkey_non=PCT_Bads/PCT_Goods;
woe=log(res_helkey_non);**weight of evidence***;
iv_categ=(PCT_Bads-PCT_Goods)*woe;
run;

PROC SUMMARY DATA= inf2;
VAR iv_categ;
OUTPUT OUT=inf3 (DROP=_TYPE_)sum=inf_val;
RUN;

data _null_;
set inf3;
call symputx('inf_val',inf_val);
run;
%put &amp;amp;inf_val;/***SUM IV over all categories***/

data inf4;
length var $100. categ $100.;
set inf2;
categ=compress(&amp;amp;var.);
drop &amp;amp;var.;
var="&amp;amp;var.";
P_Default=Nr_bad/Nr_Customers;
label iv_categ='information value(iv)';
run;

data inf5;
retain 
var
categ
Nr_Customers
Nr_bad
Nr_good
PCT_Bads
PCT_Goods
P_Default
res_helkey_non
woe
iv_categ
;
set inf4; 
format PCT_Bads PCT_Goods res_helkey_non woe iv_categ P_Default percent10.3;
format Nr_Customers Nr_bad Nr_good comma15.  ;
run;


PROC SUMMARY DATA= inf5;
VAR Nr_Customers Nr_bad Nr_good  PCT_Bads  PCT_Goods  iv_categ;
OUTPUT OUT=Summary_IV (DROP=_TYPE_ _freq_) sum=;
RUN;

Data Summary_IV_b;
Retain  field category ;
set Summary_IV;
VAR="&amp;amp;var.";
P_Default=Nr_bad/Nr_Customers;
format P_Default percent8.2;
Run;

Data Want;
set inf5  Summary_IV_b;
IF  missing(categ)  then IV_desc=put(iv_categ,IV_fmt.);
else IV_desc='';
Run;

title;
proc print data=Want noobs;Run;
%mend inf_value;
%inf_value(Raw_Data_tbl=ttt,var=X,Response_VAR=Ind_Default)
&lt;/CODE&gt;&lt;/PRE&gt;
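&lt;P&gt;Criteria 3 to 5 can also be checked directly once a candidate binning exists. The following is only a minimal sketch: the data set name bin_summary, its columns (bin, Nr_Customers, Nr_bad) and the numbers in it are made up for illustration, with one row per bin and bin 1 as the worst. It also assumes that "1%" means an absolute difference of 0.01 in the bad rate.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
/* Hypothetical input: one row per bin, sorted so that bin 1 is the worst */
data bin_summary;
input bin Nr_Customers Nr_bad;
datalines;
1 500 90
2 800 80
3 1200 70
4 2500 60
;
run;

data bin_check;
set bin_summary;
bad_rate=Nr_bad/Nr_Customers;
prev_rate=lag(bad_rate);        /* bad rate of the previous (worse) bin */
enough_bads=(Nr_bad ge 50);     /* criterion 3: at least 50 bads */
/* criteria 4 and 5: the rate must fall by at least 0.01 per step (assumed absolute) */
if _n_=1 then step_ok=1;
else step_ok=(prev_rate-bad_rate ge 0.01);
format bad_rate prev_rate percent8.2;
run;

proc print data=bin_check noobs;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;A binning passes these checks only if enough_bads and step_ok equal 1 in every row.&lt;/P&gt;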
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2024 07:35:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936719#M368148</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-23T07:35:23Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936724#M368149</link>
      <description>&lt;P&gt;Are you building a scorecard?&lt;/P&gt;
&lt;P&gt;The following is code I have used before for this purpose.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/*
proc import datafile='c:\temp\1--German Credit.xlsx' dbms=xlsx out=have replace;
run;
*/

%let var=weight;  *the continuous variable to split;
%let group=4 ;    *the number of groups to bin into;
%let n_iter=100;  *the number of Genetic Algorithm iterations used to search for the best WOE IV;


data temp;
 set sashelp.heart(obs=200);
 good_bad=ifc(status='Alive','good','bad ');
 if not missing(&amp;amp;var);
 keep &amp;amp;var good_bad ;
run;


proc sql noprint;
 select sum(good_bad='bad'),sum(good_bad='good'),
        floor(min(&amp;amp;var)),ceil(max(&amp;amp;var)) into : n_bad,: n_good,: min,: max
  from temp;
quit;
%put &amp;amp;n_bad &amp;amp;n_good &amp;amp;min &amp;amp;max;
proc sort data=temp;by &amp;amp;var ;run;
proc iml;
use temp(where=(&amp;amp;var is not missing));
read all var {&amp;amp;var good_bad};
close;

start function(x) global(bin,&amp;amp;var ,good_bad,group,woe);
if countunique(x)=group-1 then do;

col_x=t(x);
call sort(col_x,1);
cutpoints= .M//col_x//.I ;
b=bin(&amp;amp;var ,cutpoints,'right');

if countunique(b)=group  then do;
do i=1 to group;
 idx=loc(b=i);
 temp=good_bad[idx];
 n_bad=sum(temp='bad');
 n_good=sum(temp='good');
 bad_dist=n_bad/&amp;amp;n_bad ; 
 good_dist=n_good/&amp;amp;n_good ; 
 if Bad_Dist&amp;gt;0.05 &amp;amp; Good_Dist&amp;gt;0.05  then woe[i]=log(Bad_Dist/Good_Dist);
  else woe[i]=.;
end;

if countmiss(woe)=0 then do;
/*
xx=j(group,1,1)||woe||woe##2;
*/
xx=j(group,1,1)||woe;
beta=solve(xx`*xx,xx`*bin);
yhat=xx*beta;
sse=ssq(bin-yhat);
end;
else sse=999999;

end;
else sse=999999;

end;
else sse=999999;

return (sse);
finish;

group=&amp;amp;group ;  
bin=t(1:group);
woe=j(group,1,.);



encoding=j(2,group-1,&amp;amp;min );
encoding[2,]=&amp;amp;max ;    

id=gasetup(2,group-1,123456789);
call gasetobj(id,0,"function");
call gasetsel(id,10,1,1);
call gainit(id,1000,encoding);


niter =  &amp;amp;n_iter ;
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
end;
call gagetmem(mem, value, id, 1);

col_mem=t(mem);
call sort(col_mem,1);
cutpoints= .M//col_mem//.I ;
b=bin(&amp;amp;var ,cutpoints,'right');

create cutpoints var {cutpoints};
append;
close;
create group var {b};
append;
close;

print value[l = "Min Value:"] ;
call gaend(id);
quit;


data all_group;
 set temp(keep=&amp;amp;var rename=(&amp;amp;var=b) where=(b is missing)) group;
run;
data all;
 merge all_group temp;
 rename b=group;
run;




title "Variable: &amp;amp;var" ;
proc sql;
create table woe_&amp;amp;var as
 select group label=' ',
min(&amp;amp;var) as min label='Minimum',max(&amp;amp;var) as max label='Maximum',count(*) as n label='Frequency',
calculated n/(select count(*) from all) as per format=percent7.2 label='Share of records',
sum(good_bad='bad') as n_bad label='Number of bad',sum(good_bad='good') as n_good label='Number of good',
sum(good_bad='bad')/(select sum(good_bad='bad') from all ) as bad_dist label='Share of bad',
sum(good_bad='good')/(select sum(good_bad='good') from all ) as good_dist label='Share of good',
log(calculated Bad_Dist/calculated Good_Dist) as woe
from all
   group by group
    order by woe;

create index group on woe_&amp;amp;var;

select *,sum(  (Bad_Dist-Good_Dist)*woe  ) as iv
 from woe_&amp;amp;var ;

quit;
title ' ';




/*
data fmt_&amp;amp;var ;
 set cutpoints;
 start=lag(cutpoints);
 end=cutpoints;
 if start=.M then hlo='IL';
 if end=.I then hlo='IH';
 if _n_ ne 1 then do;group+1;output;end;
run;
data fmt_&amp;amp;var(index=(group));
 merge  fmt_&amp;amp;var woe_&amp;amp;var(keep=group woe);
 by group;
 retain fmtname "&amp;amp;var" type 'I';
 keep group fmtname type start end woe hlo;
 rename woe=label;
 label group=' ';
run;
proc format cntlin=fmt_&amp;amp;var library=z;
run;



proc print data=woe_&amp;amp;var noobs label;run;
proc sgplot data=woe_&amp;amp;var;
reg y=group x=woe/degree=2 cli clm jitter;
run;
*/
proc sgplot data=woe_&amp;amp;var noautolegend;
 vbar group/response=woe nostatlabel missing;
 vline group/response=woe nostatlabel missing markers MARKERATTRS=(symbol=circlefilled 
  size=12) MARKERFILLATTRS=(color=white) MARKEROUTLINEATTRS=graphdata1
  FILLEDOUTLINEDMARKERS;
run;


&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Ksharp_0-1721721809534.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/98616iA20AED15A073F3EF/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Ksharp_0-1721721809534.png" alt="Ksharp_0-1721721809534.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Ksharp_1-1721721838823.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/98617i482089B9C4FA67A4/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Ksharp_1-1721721838823.png" alt="Ksharp_1-1721721838823.png" /&gt;&lt;/span&gt;&lt;/P&gt;
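&lt;P&gt;A note on the cutpoints vector in the code above: it is padded with the special missing values .M and .I, which the SAS/IML BIN function treats as minus and plus infinity. The following is only a minimal sketch with made-up values and cutpoints, not output from the run shown above:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc iml;
x={120, 137, 140, 150, 200};         /* made-up values to be binned */
cutpoints= .M//{137, 144, 174}//.I;  /* .M = minus infinity, .I = plus infinity */
b=bin(x,cutpoints,'right');          /* 'right': each interval is closed on the right */
print x b;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With three interior cutpoints this yields four bins, and no value can fall outside the outer intervals.&lt;/P&gt;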
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2024 08:04:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936724#M368149</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-23T08:04:09Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936725#M368150</link>
      <description>&lt;P&gt;Thank you so much!&lt;/P&gt;
&lt;P&gt;Yes, I am building a credit risk model (scorecard).&lt;/P&gt;
&lt;P&gt;May I ask where each of the criteria I mentioned is handled:&lt;/P&gt;
&lt;P&gt;Classify into 4 or 5 categories&amp;nbsp; ----&lt;FONT color="#FF0000"&gt;Defined in macro var %let group&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Category 1 is the worst category (the category with the highest bad rate)&lt;STRONG&gt;&lt;FONT color="#FF0000"&gt;--Where did you define it?&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Each category must contain at least 50 bads (people with Y=1)&lt;STRONG&gt;&lt;FONT color="#FF0000"&gt;--Where did you define it?&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The bad rate must differ by at least 1% from one category to the next.&lt;STRONG&gt;&lt;FONT color="#FF0000"&gt;--Where did you define it?&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2024 08:24:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936725#M368150</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-23T08:24:00Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936728#M368151</link>
      <description>Classify into 4 or 5 categories  ----Defined in macro var %let group&lt;BR /&gt;Answer: Yes.&lt;BR /&gt;&lt;BR /&gt;Category 1 is the worst category (the category with the highest bad rate)--Where did you define it?&lt;BR /&gt;Answer: Here, category 4 is the worst; you can see that category 4 has the lowest WOE. If you want category 1 to be the worst, you can reverse the group numbers.&lt;BR /&gt;But that does not mean that category 4 has the highest bad rate. I did not consider this criterion; I only maximize the IV.&lt;BR /&gt;&lt;BR /&gt;Each category must contain at least 50 bads (people with Y=1)--Where did you define it?&lt;BR /&gt;Answer: If you have more than 200 observations, this criterion should be met. But I don't guarantee it, since I only maximize the IV.&lt;BR /&gt;&lt;BR /&gt;The bad rate must differ by at least 1% from one category to the next.--Where did you define it?&lt;BR /&gt;Answer: If you have more data, that is likely to happen. But I don't guarantee it, since I only maximize the IV.</description>
      <pubDate>Tue, 23 Jul 2024 08:54:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936728#M368151</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-23T08:54:54Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936731#M368152</link>
      <description>&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Suppose the data set has 100,000 rows.&lt;/P&gt;
&lt;P&gt;Approximately how long should it take to run the 100 iterations for one variable only?&lt;/P&gt;
&lt;P&gt;Which code should I modify so that the worst group is 1 and the best group is 4 (with 4 groups defined)?&lt;/P&gt;
&lt;P&gt;Can you show the code please?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2024 09:06:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936731#M368152</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-23T09:06:30Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936733#M368153</link>
      <description>If you need to satisfy these criteria, I suggest you post the question in the OR forum:&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/Mathematical-Optimization/bd-p/operations_research" target="_blank"&gt;https://communities.sas.com/t5/Mathematical-Optimization/bd-p/operations_research&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;And call out &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/1636"&gt;@RobPratt&lt;/a&gt;</description>
      <pubDate>Tue, 23 Jul 2024 09:11:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936733#M368153</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-23T09:11:14Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936736#M368154</link>
      <description>You could test it yourself. But I think it would be very slow, at least 3 minutes I guess.&lt;BR /&gt;Change the group number here:&lt;BR /&gt;data all;&lt;BR /&gt; merge all_group temp;&lt;BR /&gt;b=5-b;  /*&amp;lt;----------*/&lt;BR /&gt; rename b=group;&lt;BR /&gt;run;</description>
      <pubDate>Tue, 23 Jul 2024 09:16:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936736#M368154</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-23T09:16:37Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936740#M368155</link>
      <description>&lt;P&gt;I have been running it for more than 2 hours and it still hasn't finished (for one variable only).&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2024 09:48:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936740#M368155</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-23T09:48:29Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936741#M368156</link>
      <description>&lt;P&gt;Do you mean&amp;nbsp;&lt;SPAN&gt;3 minutes for each iteration?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;So 100 iterations would take 300 minutes?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2024 09:55:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936741#M368156</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-23T09:55:37Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936756#M368158</link>
      <description>Sorry, I didn't understand. Which criteria (what question) do you recommend that I post in this OR forum? What is the purpose of that forum, please?</description>
      <pubDate>Tue, 23 Jul 2024 13:26:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936756#M368158</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-23T13:26:36Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936895#M368171</link>
      <description>&lt;P&gt;Yes. 3 minutes for one variable.&lt;BR /&gt;Since you have run it for more than two hours, you could try reducing %let n_iter=100.&lt;/P&gt;
&lt;P&gt;Or get more people to run the code, each on one variable?&lt;/P&gt;
&lt;P&gt;P.S. I do not recommend reducing "%let n_iter=100"; on the contrary, I would use more than 100 iterations to get a better IV.&lt;/P&gt;
&lt;P&gt;In summary, if you want the fastest way to solve your problem, post your question in the OR forum and call out&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/1636"&gt;@RobPratt&lt;/a&gt; .&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2024 00:46:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936895#M368171</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-24T00:46:57Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936896#M368172</link>
      <description>All the conditions you mentioned ----&amp;gt; "I have some criteria:"&lt;BR /&gt;&lt;BR /&gt;SAS/OR is for solving optimization problems.&lt;BR /&gt;Since your question looks like a search for an optimal value, you could try to get help from &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/1636"&gt;@RobPratt&lt;/a&gt; &lt;BR /&gt;Try posting your question in the OR forum.&lt;BR /&gt;</description>
      <pubDate>Wed, 24 Jul 2024 00:17:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936896#M368172</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-24T00:17:17Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936915#M368177</link>
      <description>&lt;P&gt;I ran your original code:&lt;/P&gt;
&lt;P&gt;In data set&amp;nbsp;CUTPOINTS there is one column called&amp;nbsp;CUTPOINTS (a numeric variable with no format).&lt;/P&gt;
&lt;P&gt;How come I see letter values when the column is numeric?&lt;/P&gt;
&lt;P&gt;What is the meaning of values "M" and "I"?&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Ronein_0-1721796311709.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/98683iD65CE2BE407EBB23/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Ronein_0-1721796311709.png" alt="Ronein_0-1721796311709.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;I also added the percentage of bads within each group to the final table.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Ronein_0-1721800259819.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/98686i74629CE67309FFB1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Ronein_0-1721800259819.png" alt="Ronein_0-1721800259819.png" /&gt;&lt;/span&gt;&lt;/P&gt;
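&lt;P&gt;For anyone following along, one way to add such a column is sketched below. It assumes the table woe_&amp;amp;var created by the PROC SQL step in the code above (with columns group, n and n_bad) and that the macro variable var is still defined; the output table name woe_rate and the column name bad_pct are hypothetical.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc sql;
create table woe_rate as
select *,
       n_bad/n as bad_pct format=percent8.2 label='Bad rate within group'
from woe_&amp;amp;var
order by group;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;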
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2024 05:51:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936915#M368177</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-24T05:51:10Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936917#M368178</link>
      <description>&lt;P&gt;I ran your code on my data for a specific variable (Nr Years in bank).&lt;/P&gt;
&lt;P&gt;In your code the best group should be 1 and the worst should be 4.&lt;/P&gt;
&lt;P&gt;I don't see Per increasing consistently when moving from group 1 to 2, 2 to 3 and 3 to 4.&lt;/P&gt;
&lt;P&gt;Is that not part of the requirement? To have a&amp;nbsp;consistent increase in per (percentage of bads)?&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Ronein_0-1721797826036.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/98685i2239A2D550F061CA/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Ronein_0-1721797826036.png" alt="Ronein_0-1721797826036.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2024 05:12:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936917#M368178</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-24T05:12:04Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936918#M368179</link>
      <description>&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;The number of groups is not always 4.&lt;/P&gt;
&lt;P&gt;I also want the worst group to be 1 and not 0.&lt;/P&gt;
&lt;P&gt;Here is the code that does it for any number of groups:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc sql noprint;
select count(distinct b) as nr_groups into :nr_groups
from all_group
;
quit;
%put &amp;amp;nr_groups;


data all;
merge all_group temp;
b=(&amp;amp;nr_groups.-b)+1; 
/***make the worst group be group 1 and so on****/
rename b=group;
Run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 24 Jul 2024 05:17:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936918#M368179</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-07-24T05:17:10Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936922#M368180</link>
      <description>"What is the meaning of values "M" and "I"?"&lt;BR /&gt;They mean negative infinity and positive infinity.&lt;BR /&gt;That is, you can map your data into bins using these cutpoints:&lt;BR /&gt;low-137='1'&lt;BR /&gt;137-144='2'&lt;BR /&gt;144-174='3'&lt;BR /&gt;174-high='4'&lt;BR /&gt;&lt;BR /&gt;"I also added to final table percent of bads within each group"&lt;BR /&gt;I am also glad to see that you got my code running and obtained the perfect result. Congratulations!</description>
      <pubDate>Wed, 24 Jul 2024 06:20:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936922#M368180</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-24T06:20:12Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936925#M368182</link>
      <description>And about your run-time problem:&lt;BR /&gt;You could start 10 SAS sessions, each for ONE variable; that could save you a lot of time.&lt;BR /&gt;P.S. If you can, set "%let n_iter=100" as high as possible to get a better IV.</description>
      <pubDate>Wed, 24 Jul 2024 06:27:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936925#M368182</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-24T06:27:05Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936926#M368183</link>
      <description>"I don't see that Per increasing consistently by moving from group 1 to 2 and 2 to 3 and 3 to 4."&lt;BR /&gt;You should calculate it on your own; my code does not include it.&lt;BR /&gt;bad_pct=n_bad/n;&lt;BR /&gt;====&amp;gt;&lt;BR /&gt;group bad_pct&lt;BR /&gt;1    0.062&lt;BR /&gt;2   0.036&lt;BR /&gt;3   0.02&lt;BR /&gt;4   0.011</description>
      <pubDate>Wed, 24 Jul 2024 06:32:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936926#M368183</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-24T06:32:16Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936927#M368184</link>
      <description>No need to add one more PROC SQL; just use the macro variable &amp;amp;group. defined at the top of the code.&lt;BR /&gt;&lt;BR /&gt;%let var=weight;  *the continuous variable you need to split;&lt;BR /&gt;%let group=4 ;    *the number of group you want to bin to;&lt;BR /&gt;...........&lt;BR /&gt;&lt;BR /&gt;data all;&lt;BR /&gt;merge all_group temp;&lt;BR /&gt;b=(&amp;amp;group.-b)+1; &lt;BR /&gt;/***I want that worse group be group 1 and so on****/&lt;BR /&gt;rename b=group;&lt;BR /&gt;Run;&lt;BR /&gt;</description>
      <pubDate>Wed, 24 Jul 2024 06:36:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936927#M368184</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-24T06:36:20Z</dc:date>
    </item>
    <item>
      <title>Re: Binning (categorize continuous var into categories)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936928#M368185</link>
      <description>"M" and "I"&lt;BR /&gt;represent the two special missing values .M and .I.&lt;BR /&gt;In the cutpoints, .M stands for negative infinity&lt;BR /&gt;and .I stands for positive infinity.</description>
      <pubDate>Wed, 24 Jul 2024 06:38:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Binning-categorize-continuous-var-into-categories/m-p/936928#M368185</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-07-24T06:38:23Z</dc:date>
    </item>
  </channel>
</rss>

