<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Creating buckets for data in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755764#M238552</link>
    <description>&lt;P&gt;I have a sample like this. In the real situation, it will contain more observations per day.&lt;CODE class=" language-sas"&gt;&lt;/CODE&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;data have;input
date id volume; datalines;
20100101 01  11
20100101 02  -5
20100101 03  10
20100101 04  -8
20100101 05  14
20100101 06  1
20100102 01  -17
20100102 02  20
20100102 03  -10
20100102 04  25
20100102 05  -39
20100102 06  20  
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Based on the absolute value of volume, I want to allocate observations into groups with a constant amount of absolute volume. In particular, this constant amount of absolute volume is the one-third of daily absolute volume.&lt;/P&gt;
&lt;P&gt;I apply the following code to estimate the one-third of daily absolute volume.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;data have;
set have;
abs_volume=abs(volume);
run;
proc means data=have noprint;
by date;
var abs_volume;
output out=total sum=daily_volume;
run;
data total;
set total;
criteria_value=floor(daily_volume/3);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;This is tricky part.&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;If the volume is larger than the group size, the extra volume will put into the next group until all of the volume has been categorised.&lt;/P&gt;
&lt;P&gt;The expected result will be following.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;date     id    volume    group
20100101 01    11        1
20100101 02    -5        1
20100101 03    10        2
20100101 04    -6        2     *This one was -8, but now it splits into -6 and -2
20100101 04    -2        3
20100101 05    14        3 
20100101 06    1         4
20100102 01    -17       1
20100102 02    20        1
20100102 03    -6        1     *This one was -10, but now it splits into -6 and -4
20100102 03    -4        2
20100102 04    25        2
20100102 05    -14       2     *This one was -39, but now it splits into -14 and -25
20100102 05    -25       3
20100102 06    18        3     *This one was 20, but now it splits into 18 and 2
20100102 06    2         4
 run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;DIV class="s-prose js-post-body"&gt;
&lt;P&gt;Can this be achieved by SAS? Many thanks!&lt;/P&gt;
&lt;/DIV&gt;
&lt;DIV class="mt24 mb12"&gt;
&lt;DIV class="post-taglist d-flex gs4 gsy fd-column"&gt;
&lt;DIV class="d-flex ps-relative"&gt;I update a sample as an example. The &lt;EM&gt;time&amp;nbsp;&lt;/EM&gt;variable in the uploaded sample can be viewed the &lt;EM&gt;ID&lt;/EM&gt; in above.&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
    <pubDate>Wed, 21 Jul 2021 23:06:17 GMT</pubDate>
    <dc:creator>Neal0801</dc:creator>
    <dc:date>2021-07-21T23:06:17Z</dc:date>
    <item>
      <title>Creating buckets for data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755764#M238552</link>
      <description>&lt;P&gt;I have a sample like this. In the real situation, it will contain more observations per day.&lt;CODE class=" language-sas"&gt;&lt;/CODE&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;data have;input
date id volume; datalines;
20100101 01  11
20100101 02  -5
20100101 03  10
20100101 04  -8
20100101 05  14
20100101 06  1
20100102 01  -17
20100102 02  20
20100102 03  -10
20100102 04  25
20100102 05  -39
20100102 06  20  
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Based on the absolute value of volume, I want to allocate observations into groups with a constant amount of absolute volume. In particular, this constant amount of absolute volume is the one-third of daily absolute volume.&lt;/P&gt;
&lt;P&gt;I apply the following code to estimate the one-third of daily absolute volume.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;data have;
set have;
abs_volume=abs(volume);
run;
proc means data=have noprint;
by date;
var abs_volume;
output out=total sum=daily_volume;
run;
data total;
set total;
criteria_value=floor(daily_volume/3);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;This is tricky part.&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;If the volume is larger than the group size, the extra volume will put into the next group until all of the volume has been categorised.&lt;/P&gt;
&lt;P&gt;The expected result will be following.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;date     id    volume    group
20100101 01    11        1
20100101 02    -5        1
20100101 03    10        2
20100101 04    -6        2     *This one was -8, but now it splits into -6 and -2
20100101 04    -2        3
20100101 05    14        3 
20100101 06    1         4
20100102 01    -17       1
20100102 02    20        1
20100102 03    -6        1     *This one was -10, but now it splits into -6 and -4
20100102 03    -4        2
20100102 04    25        2
20100102 05    -14       2     *This one was -39, but now it splits into -14 and -25
20100102 05    -25       3
20100102 06    18        3     *This one was 20, but now it splits into 18 and 2
20100102 06    2         4
 run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;DIV class="s-prose js-post-body"&gt;
&lt;P&gt;Can this be achieved by SAS? Many thanks!&lt;/P&gt;
&lt;/DIV&gt;
&lt;DIV class="mt24 mb12"&gt;
&lt;DIV class="post-taglist d-flex gs4 gsy fd-column"&gt;
&lt;DIV class="d-flex ps-relative"&gt;I update a sample as an example. The &lt;EM&gt;time&amp;nbsp;&lt;/EM&gt;variable in the uploaded sample can be viewed the &lt;EM&gt;ID&lt;/EM&gt; in above.&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Wed, 21 Jul 2021 23:06:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755764#M238552</guid>
      <dc:creator>Neal0801</dc:creator>
      <dc:date>2021-07-21T23:06:17Z</dc:date>
    </item>
    <item>
      <title>Re: Creating buckets for data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755770#M238554</link>
      <description>&lt;P&gt;Have you looked at proc rank?&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jul 2021 22:26:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755770#M238554</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-07-21T22:26:51Z</dc:date>
    </item>
    <item>
      <title>Re: Creating buckets for data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755774#M238558</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Hi, this this not just ranked by value. It should also maintain the order of ID.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jul 2021 22:43:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755774#M238558</guid>
      <dc:creator>Neal0801</dc:creator>
      <dc:date>2021-07-21T22:43:44Z</dc:date>
    </item>
    <item>
      <title>Re: Creating buckets for data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755776#M238559</link>
      <description>&lt;P&gt;I see. Sorry I read too quickly. &lt;STRIKE&gt;Why is -17 in group 1?&lt;/STRIKE&gt;&amp;nbsp;[Got it, group by increasing ID] Can you rewrite your comment to use the sample provided instead of 49 observations?&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jul 2021 22:49:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755776#M238559</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-07-21T22:49:45Z</dc:date>
    </item>
    <item>
      <title>Re: Creating buckets for data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755799#M238570</link>
      <description>&lt;P&gt;What if you only have 2 records for a given date?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What if you have 3 records with volumes,&amp;nbsp; &amp;nbsp;1, 1, and 100?&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jul 2021 00:29:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755799#M238570</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2021-07-22T00:29:47Z</dc:date>
    </item>
    <item>
      <title>Re: Creating buckets for data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755802#M238572</link>
      <description>&lt;P&gt;Not sure that I understand what is meant by "with a constant amount of absolute volume. "&lt;/P&gt;
&lt;P&gt;"constant amount" is not a standard statistic. So some rules, not one or two examples, but actual rules how to derive the "constant amount" should be provided.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jul 2021 00:37:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755802#M238572</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-07-22T00:37:50Z</dc:date>
    </item>
    <item>
      <title>Re: Creating buckets for data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755900#M238585</link>
      <description>&lt;P&gt;Like this?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data SIZE(keep=DATE SIZE );
  set HAVE ;
  by DATE;
  if first.DATE then SIZE=0;
  SIZE+abs(VOLUME);
  if last.DATE then output ;
run;

data WANT; 
  set HAVE;
  by DATE;
  if first.DATE then do;
    set SIZE;
    TALLY=0;
    GRP=1;
  end;
  TALLY+abs(VOLUME);        
  LMT=int(SIZE*GRP/3) ;
  if TALLY&amp;gt;LMT then do;
    TMP=VOLUME;
    VOLUME=sign(VOLUME)*(abs(VOLUME)-TALLY+LMT);  
    if VOLUME then output;
    GRP+1;
    VOLUME=TMP-VOLUME;                             
    output;
  end;
  if TALLY=LMT then do;               
    output;
    GRP+1;
  end;
  if TALLY&amp;lt;LMT then output;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;DIV class="branch"&gt;&lt;A name="IDX" target="_blank"&gt;&lt;/A&gt;
&lt;DIV&gt;
&lt;DIV align="left"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.WANT" width="213px" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;&lt;COLGROUP&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt; &lt;COL /&gt;&lt;/COLGROUP&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="r header" scope="col" width="83px"&gt;DATE&lt;/TH&gt;
&lt;TH class="r header" scope="col" width="40px"&gt;ID&lt;/TH&gt;
&lt;TH class="r header" scope="col" width="49px"&gt;VOLUME&lt;/TH&gt;
&lt;TH class="r header" scope="col" width="40px"&gt;GRP&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100101&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;1&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;11&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100101&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;2&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-5&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100101&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;10&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100101&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;4&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-6&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100101&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;4&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-2&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100101&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;5&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;14&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100101&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;6&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;1&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;1&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-17&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;2&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;20&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-6&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-4&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;4&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;25&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;5&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-15&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;2&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;5&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;-24&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="83px" class="r data"&gt;20100102&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;6&lt;/TD&gt;
&lt;TD width="49px" class="r data"&gt;20&lt;/TD&gt;
&lt;TD width="40px" class="r data"&gt;3&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jul 2021 10:05:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-buckets-for-data/m-p/755900#M238585</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-07-22T10:05:56Z</dc:date>
    </item>
  </channel>
</rss>

