<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Data rollup - sum previous rows if criteria are met in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462270#M117668</link>
    <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/118477"&gt;@dolldata&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;Assuming all your data is in SAS tables here a data step option which should perform much better.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want2(drop=_:);
  if _n_=1 then
    do;
      if 0 then set have(keep=customer order_date demand rename=(order_date=_order_date demand=_demand));
      dcl hash h1(dataset:'have(keep=customer order_date demand rename=(order_date=_order_date demand=_demand))', multidata:'y');
      h1.defineKey('customer');
      h1.defineData('_order_date','_demand');
      h1.defineDone();
    end;

  set have;

  demand366_calc=0;
  demand732_calc=0;
  do while(h1.do_over() eq 0);
    if order_date &amp;gt; _order_date then
      do;
        if _order_date &amp;gt;= order_date-366 then demand366_calc=sum(demand366_calc,_demand);
        if _order_date &amp;gt;= order_date-732 then demand732_calc=sum(demand732_calc,_demand);
      end;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This option loads all the rows and required variables from your source into a hash lookup table in memory. 7.4 million records should fit but in case you encounter out-of-memory conditions: There would be less resource hungry alternatives/extensions - but they&amp;nbsp;would require additional code logic (=harder to understand and maintain).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Change 16/05/2018:&lt;/P&gt;
&lt;P&gt;Code above fixed based on&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/118477"&gt;@dolldata&lt;/a&gt;'s comment (replace source table name WANT2 with HAVE).&lt;/P&gt;</description>
    <pubDate>Tue, 15 May 2018 19:20:15 GMT</pubDate>
    <dc:creator>Patrick</dc:creator>
    <dc:date>2018-05-15T19:20:15Z</dc:date>
    <item>
      <title>Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/461447#M117393</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am trying to summarize customer behavior to determine on each&amp;nbsp;one of their&amp;nbsp;orders if they were a new customer on that order or an existing customer.&amp;nbsp; To make things more difficult, I want to break the existing customers into more specific groups, based on if they had purchased within a year before or not.&amp;nbsp; Furthermore, if they had purchased in the previous year, break them apart by how much they spent in that previous year.&amp;nbsp; I have a dataset of customers and all of their orders, and want to leave it at the order level (one line per order, not one line per customer).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For an example household I already have the first four columns, but want to calculate the last two columns (demand366 and demand732) to be able to then classify existing customers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;customer&lt;/TD&gt;&lt;TD&gt;first_order&lt;/TD&gt;&lt;TD&gt;order_date&lt;/TD&gt;&lt;TD&gt;demand&lt;/TD&gt;&lt;TD&gt;demand366&lt;/TD&gt;&lt;TD&gt;demand732&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;$36.25&lt;/TD&gt;&lt;TD&gt;$0.00&lt;/TD&gt;&lt;TD&gt;$0.00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;03/18/15&lt;/TD&gt;&lt;TD&gt;$50.00&lt;/TD&gt;&lt;TD&gt;$36.25&lt;/TD&gt;&lt;TD&gt;$36.25&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;11/11/15&lt;/TD&gt;&lt;TD&gt;$126.00&lt;/TD&gt;&lt;TD&gt;$86.25&lt;/TD&gt;&lt;TD&gt;$86.25&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;02/06/16&lt;/TD&gt;&lt;TD&gt;$72.00&lt;/TD&gt;&lt;TD&gt;$212.25&lt;/TD&gt;&lt;TD&gt;$212.25&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;10/20/16&lt;/TD&gt;&lt;TD&gt;$186.95&lt;/TD&gt;&lt;TD&gt;$198.00&lt;/TD&gt;&lt;TD&gt;$284.25&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;10/15/17&lt;/TD&gt;&lt;TD&gt;$45.00&lt;/TD&gt;&lt;TD&gt;$186.95&lt;/TD&gt;&lt;TD&gt;$384.95&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;03/15/15&lt;/TD&gt;&lt;TD&gt;01/27/18&lt;/TD&gt;&lt;TD&gt;$92.00&lt;/TD&gt;&lt;TD&gt;$45.00&lt;/TD&gt;&lt;TD&gt;$303.95&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So, for customer 123, on their first order (3/15/15), they would have $0 in demand for the 12 months (366 days) and 24 months (732 days) PRIOR to that first order.&amp;nbsp; On their next order, 3/18/15, they would have spent $36.25 PRIOR to that order (both for 12 and 24 months).&amp;nbsp; To get more complicated, if you look at the order on 10/20/16, only the prior two orders were within 12 months, so the demand366 total would be $198.00 ($126 + $72), but the 24 month total would include all four prior orders ($36.25 + $50 + $126 + $72 = $284.25).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm struggling on how to get SAS to calculate this, and if do loops or first./last. or something within proc sql would work best...or some combination of all of those methods.&amp;nbsp; I know I am over-complicating this in my mind...I feel there should be a fairly easy way to do this?&amp;nbsp; Any suggestions would be greatly appreciated!&lt;/P&gt;</description>
      <pubDate>Thu, 10 May 2018 23:20:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/461447#M117393</guid>
      <dc:creator>dolldata</dc:creator>
      <dc:date>2018-05-10T23:20:58Z</dc:date>
    </item>
    <item>
      <title>Re: Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/461451#M117397</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/118477"&gt;@dolldata&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;Please post in the future your sample data in the form of a working SAS data step so we can spend time on answering your question instead of reading your data into a SAS dataset.&lt;BR /&gt;Below SQL should do what you're asking for.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  create table want as
    select 
      customer,   
      first_order,
      order_date, 
      demand,     
      demand366,  
      demand732,
      coalesce((select sum(i.demand) from have as I where i.customer=o.customer and i.order_date between o.order_date-1 and o.order_date-366),0) as demand366_calc,
      coalesce((select sum(i.demand) from have as I where i.customer=o.customer and i.order_date between o.order_date-1 and o.order_date-732),0) as demand732_calc
    from have as O
    ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 11 May 2018 00:08:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/461451#M117397</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-05-11T00:08:13Z</dc:date>
    </item>
    <item>
      <title>Re: Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/461461#M117403</link>
      <description>Apologies for not putting the data in a step - that was a big miss. But thank you for this solution - it works perfectly! I had not heard of the coalesce function, but will definitely be researching it more.</description>
      <pubDate>Fri, 11 May 2018 03:26:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/461461#M117403</guid>
      <dc:creator>dolldata</dc:creator>
      <dc:date>2018-05-11T03:26:16Z</dc:date>
    </item>
    <item>
      <title>Re: Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462258#M117658</link>
      <description>&lt;P&gt;I hate to add another reply to the string, especially after accepting the solution, but is there a way to speed up the calculations?&amp;nbsp; I am running this on about 7.4 million records, summarizing the activity of 974K customers (we are running SAS Studio release 3.6, with the processing on a server).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I realize it should take a long time, as it is doing so many calculations on each customer, but at the current rate, it will take about three days to get the processing done.&amp;nbsp; Which may just be the way it is.&amp;nbsp; Just curious if there are ideas on how to optimize that?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 15 May 2018 05:04:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462258#M117658</guid>
      <dc:creator>dolldata</dc:creator>
      <dc:date>2018-05-15T05:04:54Z</dc:date>
    </item>
    <item>
      <title>Re: Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462262#M117662</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/118477"&gt;@dolldata&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;The SQL code posted is certainly not the most performant option. Is your source table in SAS or in a database (and if database: Which one?).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 15 May 2018 05:34:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462262#M117662</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-05-15T05:34:55Z</dc:date>
    </item>
    <item>
      <title>Re: Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462270#M117668</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/118477"&gt;@dolldata&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;Assuming all your data is in SAS tables here a data step option which should perform much better.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want2(drop=_:);
  if _n_=1 then
    do;
      if 0 then set have(keep=customer order_date demand rename=(order_date=_order_date demand=_demand));
      dcl hash h1(dataset:'have(keep=customer order_date demand rename=(order_date=_order_date demand=_demand))', multidata:'y');
      h1.defineKey('customer');
      h1.defineData('_order_date','_demand');
      h1.defineDone();
    end;

  set have;

  demand366_calc=0;
  demand732_calc=0;
  do while(h1.do_over() eq 0);
    if order_date &amp;gt; _order_date then
      do;
        if _order_date &amp;gt;= order_date-366 then demand366_calc=sum(demand366_calc,_demand);
        if _order_date &amp;gt;= order_date-732 then demand732_calc=sum(demand732_calc,_demand);
      end;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This option loads all the rows and required variables from your source into a hash lookup table in memory. 7.4 million records should fit but in case you encounter out-of-memory conditions: There would be less resource hungry alternatives/extensions - but they&amp;nbsp;would require additional code logic (=harder to understand and maintain).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Change 16/05/2018:&lt;/P&gt;
&lt;P&gt;Code above fixed based on&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/118477"&gt;@dolldata&lt;/a&gt;'s comment (replace source table name WANT2 with HAVE).&lt;/P&gt;</description>
      <pubDate>Tue, 15 May 2018 19:20:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462270#M117668</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-05-15T19:20:15Z</dc:date>
    </item>
    <item>
      <title>Re: Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462360#M117710</link>
      <description>&lt;P&gt;I am working with SAS datasets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I kept getting an error that the first instance of the&amp;nbsp;'want2' table didn't exist:&lt;/P&gt;&lt;DIV class="sasSource"&gt;&lt;DIV class="sasSource"&gt;&lt;DIV class="sasSource"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt; 78         data want2(drop=_:);
 79         
 80           if _n_=1 then
 81             do;
 82               if 0 then set want2(keep=customer order_date demand rename=(order_date=_order_date demand=_demand));
 ERROR: File WORK.WANT2.DATA does not exist.
 83               dcl hash h1(dataset:'want2(keep=customer order_date demand rename=(order_date=_order_date demand=_demand))',
 83       ! multidata:'y');
 84               h1.defineKey('customer');
 85               h1.defineData('_order_date','_demand');
 86               h1.defineDone();
 87             end;
 88         
 89           set have;
 90         
 91           demand366_calc=0;
 92           demand732_calc=0;
 93           do while(h1.do_over() eq 0);
 94             if order_date &amp;gt; _order_date then
 95               do;
 96                 if _order_date &amp;gt;= order_date-366 then demand366_calc=sum(demand366_calc,_demand);
 97                 if _order_date &amp;gt;= order_date-732 then demand732_calc=sum(demand732_calc,_demand);
 98               end;
 99           end;
 100        run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After trial and error, replacing 'want2' with 'have' within the datastep made it work:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want2(drop=_:);

  if _n_=1 then
    do;
      if 0 then set have(keep=customer order_date demand rename=(order_date=_order_date demand=_demand));
      dcl hash h1(dataset:'have(keep=customer order_date demand rename=(order_date=_order_date demand=_demand))', multidata:'y');
      h1.defineKey('customer');
      h1.defineData('_order_date','_demand');
      h1.defineDone();
    end;

  set have;

  demand366_calc=0;
  demand732_calc=0;
  do while(h1.do_over() eq 0);
    if order_date &amp;gt; _order_date then
      do;
        if _order_date &amp;gt;= order_date-366 then demand366_calc=sum(demand366_calc,_demand);
        if _order_date &amp;gt;= order_date-732 then demand732_calc=sum(demand732_calc,_demand);
      end;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have spot-checked several records and everything appears to be calculating correctly.&amp;nbsp; This code is incredible -&amp;nbsp;processes all 7.4 million records in just 15 seconds.&amp;nbsp; I definitely need to spend more time understanding hash processing!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you again for the help!&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 15 May 2018 14:06:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462360#M117710</guid>
      <dc:creator>dolldata</dc:creator>
      <dc:date>2018-05-15T14:06:41Z</dc:date>
    </item>
    <item>
      <title>Re: Data rollup - sum previous rows if criteria are met</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462476#M117749</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/118477"&gt;@dolldata&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;You've passed the test (just joking!). But great that you've identified and fixed the issue yourself&amp;nbsp;&lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt; and that things are working in your environment.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I suggest that if you still can mark the hash code as solution instead of the SQL code.&lt;/P&gt;</description>
      <pubDate>Tue, 15 May 2018 19:34:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-rollup-sum-previous-rows-if-criteria-are-met/m-p/462476#M117749</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-05-15T19:34:11Z</dc:date>
    </item>
  </channel>
</rss>

