<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Bucketing data according to proportion of a variable in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568899#M160243</link>
    <description>Yes, I need my data to be in order of dollar amount (from lowest to highest) and then divided into groups that each have the same sum of total dollars. The article you linked splits data into groups with the same mean which is not what I want. Can you please help?</description>
    <pubDate>Tue, 25 Jun 2019 18:38:08 GMT</pubDate>
    <dc:creator>benvar</dc:creator>
    <dc:date>2019-06-25T18:38:08Z</dc:date>
    <item>
      <title>Bucketing data according to proportion of a variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568767#M160181</link>
      <description>&lt;P&gt;I have data of purchases with different payment amounts.&amp;nbsp; I want to bucket the data so that each bucket has an equal proportion of total dollars.&amp;nbsp; I am experimenting with PROC HPBIN but I don't know which bucketing method to use or if I need to use a different PROC.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;Example of data&lt;/U&gt;:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Purchase #&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Dollar Amount&lt;/P&gt;&lt;P&gt;1&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$10&lt;/P&gt;&lt;P&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$20&lt;/P&gt;&lt;P&gt;3&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$10&lt;/P&gt;&lt;P&gt;4&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$10&lt;/P&gt;&lt;P&gt;5&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$50&lt;/P&gt;&lt;P&gt;6&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$30&lt;/P&gt;&lt;P&gt;7&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$10&lt;/P&gt;&lt;P&gt;8&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$20&lt;/P&gt;&lt;P&gt;9&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$10&lt;/P&gt;&lt;P&gt;10&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$20&lt;/P&gt;&lt;P&gt;11&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;$10&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;Code should result in buckets like this&lt;/STRONG&gt;&lt;/EM&gt;:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Bucket 1: $0-$10: total of $50 (Purchases #1, 3, 4, 7, 9)&lt;/P&gt;&lt;P&gt;Bucket 2: $10-$20: total of $50 (Purchases #11, 2, &lt;span class="lia-unicode-emoji" title=":smiling_face_with_sunglasses:"&gt;😎&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Bucket 3: $20-$30: total of $50 (Purchases #10, 6)&lt;/P&gt;&lt;P&gt;Bucket 4: More than $30: total of $50 (Purchase #5)&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jun 2019 13:48:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568767#M160181</guid>
      <dc:creator>benvar</dc:creator>
      <dc:date>2019-06-25T13:48:30Z</dc:date>
    </item>
    <item>
      <title>Re: Bucketing data according to proportion of a variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568878#M160233</link>
      <description>&lt;P&gt;I think that most people would not consider this "binning." Binning usually means assigning similar units to groups: For example, you could bin by the dollar amount and get one bin that has small amounts, another that has medium amounts, and one that has large amounts.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;From your description, it sounds like you want to&amp;nbsp;split the data into groups that have (approximately) sum (or proportion) of dollars. If that is correct, you might want to read the article &lt;A href="https://blogs.sas.com/content/iml/2017/05/01/split-data-groups-mean-variance.html" target="_self"&gt;"Split data into groups that have the same mean "&amp;nbsp;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;See also the comments at the end of the article for an alternative solution.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jun 2019 17:59:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568878#M160233</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2019-06-25T17:59:19Z</dc:date>
    </item>
    <item>
      <title>Re: Bucketing data according to proportion of a variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568899#M160243</link>
      <description>Yes, I need my data to be in order of dollar amount (from lowest to highest) and then divided into groups that each have the same sum of total dollars. The article you linked splits data into groups with the same mean which is not what I want. Can you please help?</description>
      <pubDate>Tue, 25 Jun 2019 18:38:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568899#M160243</guid>
      <dc:creator>benvar</dc:creator>
      <dc:date>2019-06-25T18:38:08Z</dc:date>
    </item>
    <item>
      <title>Re: Bucketing data according to proportion of a variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568937#M160254</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/277499"&gt;@benvar&lt;/a&gt;&amp;nbsp;:&lt;/P&gt;
&lt;P&gt;Your presumption that what you need to do is possible is based on the assumption that the data are actually structured in such a way that as you go through the amounts in ascending order, first N1 items will tally up to 50, next N2 items will also tally up&amp;nbsp; to 50, and so on. Your sample data shown here are in fact structured in this manner. If your real data also are, then try this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                         
  input purchase amount ;           
  cards ;                           
 1  10                              
 2  20                              
 3  10                              
 4  10                              
 5  50                              
 6  30                              
 7  10                              
 8  20                              
 9  10                              
10  20                              
11  10                              
run ;                               
                                    
proc sort ;                         
  by amount ;                       
run ;                               
                                    
data want (drop = _:) ;             
  do _q = 1 by 1 until (_sum = 50) ;
    set have ;                      
    _sum = sum (_sum, amount) ;     
  end ;                             
  bucket = _n_ ;                    
  do _q = 1 to _q ;                 
    set have ;                      
    output ;                        
  end ;                             
run ;                               
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jun 2019 21:46:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/568937#M160254</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-06-25T21:46:58Z</dc:date>
    </item>
    <item>
      <title>Re: Bucketing data according to proportion of a variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/569058#M160307</link>
      <description>&lt;P&gt;As Paul says, there are assumptions that may or may not hold. Here's a program that enables the cumulative sums to exceed the target values (50).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=Have out=HaveSort;
by Amount;
run;

/* manually inspect the sum */
proc means data=Have Sum;
var Amount;
run;

%let nGroups = 4;
%let SUM = 200;    /* &amp;lt;== input sum from PROC MEANS */

data Cumul;
retain Bin 1;
set HaveSort;
d = &amp;amp;SUM / &amp;amp;nGroups;
cusum + Amount;
diff = cusum - d;
if diff &amp;gt; 0 then do;
   cusum = cusum - d;
   Bin + 1;
end;
run;

proc print; run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The code won't work perfectly if there are very large amounts like $200.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jun 2019 12:18:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Bucketing-data-according-to-proportion-of-a-variable/m-p/569058#M160307</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2019-06-26T12:18:31Z</dc:date>
    </item>
  </channel>
</rss>

