<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Cumulative Sum Across Years and Months in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59818#M16892</link>
    <description>I'm trying to find the cumulative sum by development from month to month in this dataset:&lt;BR /&gt;
&lt;BR /&gt;
development_pk year month total&lt;BR /&gt;
272	2005	10	14&lt;BR /&gt;
272	2005	11	18&lt;BR /&gt;
272	2005	12	11&lt;BR /&gt;
272	2006	1	8&lt;BR /&gt;
272	2006	2	3&lt;BR /&gt;
272	2006	3	22</description>
    <pubDate>Wed, 29 Jul 2009 14:04:01 GMT</pubDate>
    <dc:creator>StephenOverton</dc:creator>
    <dc:date>2009-07-29T14:04:01Z</dc:date>
    <item>
      <title>Cumulative Sum Across Years and Months</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59818#M16892</link>
      <description>I'm trying to find the cumulative sum by development from month to month in this dataset:&lt;BR /&gt;
&lt;BR /&gt;
development_pk year month total&lt;BR /&gt;
272	2005	10	14&lt;BR /&gt;
272	2005	11	18&lt;BR /&gt;
272	2005	12	11&lt;BR /&gt;
272	2006	1	8&lt;BR /&gt;
272	2006	2	3&lt;BR /&gt;
272	2006	3	22</description>
      <pubDate>Wed, 29 Jul 2009 14:04:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59818#M16892</guid>
      <dc:creator>StephenOverton</dc:creator>
      <dc:date>2009-07-29T14:04:01Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative Sum Across Years and Months</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59819#M16893</link>
      <description>Forum kept truncating my post due to a less than sign, the '&lt;B&gt;le&lt;/B&gt;' in bold below is actually a less than or equal.&lt;BR /&gt;
&lt;BR /&gt;
This is my proc sql so far:&lt;BR /&gt;
&lt;BR /&gt;
create table BilledUnitsCuSum as&lt;BR /&gt;
  select t1.*, sum(t2.total) as CuSum&lt;BR /&gt;
  from BilledUnitsFixed as t1&lt;BR /&gt;
    join BilledUnitsFixed as t2 on t2.development_pk = t1.development_pk and t2.year = t1.year and t2.month &lt;B&gt;le&lt;/B&gt; t1.month&lt;BR /&gt;
  group by t1.development_pk, t1.year, t1.month, t1.total&lt;BR /&gt;
  order by development_pk, year, month&lt;BR /&gt;
&lt;BR /&gt;
But as you can see from this output, the cumulative sum resets after each year because the month resets back to 1:&lt;BR /&gt;
&lt;BR /&gt;
development_pk year month total&lt;BR /&gt;
272	2005	10	14	59&lt;BR /&gt;
272	2005	11	18	77&lt;BR /&gt;
272	2005	12	11	88&lt;BR /&gt;
272	2006	1	8	8&lt;BR /&gt;
272	2006	2	3	11&lt;BR /&gt;
272	2006	3	22	33&lt;BR /&gt;
&lt;BR /&gt;
How can I take advantage of using the _N_ variable (I think this is the observation/row number) to use in my join clause instead of the month??&lt;BR /&gt;
&lt;BR /&gt;
Thanks!</description>
      <pubDate>Wed, 29 Jul 2009 14:21:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59819#M16893</guid>
      <dc:creator>StephenOverton</dc:creator>
      <dc:date>2009-07-29T14:21:37Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative Sum Across Years and Months</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59820#M16894</link>
      <description>Disregard, I figured it out.&lt;BR /&gt;
&lt;BR /&gt;
I created a new column in the previous data step called ID:&lt;BR /&gt;
&lt;BR /&gt;
ID = _n_;&lt;BR /&gt;
&lt;BR /&gt;
Then here is the updated SQL (changing the 'le' to an actual sign):&lt;BR /&gt;
&lt;BR /&gt;
proc sql;&lt;BR /&gt;
  create table BilledUnitsCuSum as&lt;BR /&gt;
    select t1.development_pk, t1.year, t1.month, t1.total, sum(t2.total) as CuSum&lt;BR /&gt;
	from BilledUnitsFixed as t1&lt;BR /&gt;
      join BilledUnitsFixed as t2 on t2.development_pk = t1.development_pk and t2.ID &lt;B&gt;le&lt;/B&gt; t1.ID&lt;BR /&gt;
	group by t1.development_pk, t1.year, t1.month, t1.total&lt;BR /&gt;
	order by development_pk, year, month&lt;BR /&gt;
	;&lt;BR /&gt;
quit;

Stupid less than sign messing up....&lt;BR /&gt;
&lt;BR /&gt;
    &lt;BR /&gt;
Message was edited by: SteveONCSU</description>
      <pubDate>Wed, 29 Jul 2009 15:17:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59820#M16894</guid>
      <dc:creator>StephenOverton</dc:creator>
      <dc:date>2009-07-29T15:17:09Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative Sum Across Years and Months</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59821#M16895</link>
      <description>Hi:&lt;BR /&gt;
  Just for comparison purposes, the DATA step program that would create a cumulative total, by DEVELOPMENT_PK would look like this (assumes your dataset is sorted or ordered by DEVELOPMENT_PK):&lt;BR /&gt;
[pre]&lt;BR /&gt;
data BilledUnitsCuSum ;&lt;BR /&gt;
  set BilledUnitsFixed ;&lt;BR /&gt;
  by development_pk;&lt;BR /&gt;
  retain CuSum;&lt;BR /&gt;
  if first.development_pk then CuSum= 0;&lt;BR /&gt;
  CuSum + total;&lt;BR /&gt;
run;&lt;BR /&gt;
                                                      &lt;BR /&gt;
proc print data=BilledUnitsCuSum;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
                                                                                 &lt;BR /&gt;
The program breaks down like this:&lt;BR /&gt;
DATA statement: creates new dataset &lt;BR /&gt;
SET statement:  identifies the input dataset&lt;BR /&gt;
BY statement: turns on by group processing using the variable listed&lt;BR /&gt;
RETAIN statement: explicitly lists a variable whose value should be                      retained or "remembered" across each iteration of the data step&lt;BR /&gt;
IF statement: tests whether an observation contains the first development_pk for a BY group and if true, resets the CuSum variable value to 0&lt;BR /&gt;
SUM statement: accumulates CuSum by adding the value of TOTAL to it. (Note that the keyword "SUM" does not appear on this statement. This form of an assignment statement is known as a SUM statement in the documentation. Do not confuse this statement with the SUM function.)&lt;BR /&gt;
RUN statement: ends the program by providing a step boundary&lt;BR /&gt;
                                 &lt;BR /&gt;
After this program runs, every observation in the dataset would now have a new variable called CuSum, which would be the cumulative total amount for that development_pk only.&lt;BR /&gt;
    &lt;BR /&gt;
One advantage of this approach is that it does not require a self-join because the RETAIN statement retains the value of the CuSum variable across iterations of the DATA step program. This means that until CuSum is reset, the value of TOTAL will keep getting added to CuSum. One feature of BY group processing is that, inside the DATA step, you can use FIRST.byvar and LAST.byvar automatic variables to test whether the input buffer holds the first or last observation in a BY group.&lt;BR /&gt;
&lt;BR /&gt;
The program is explicitly setting CuSum to 0 at the first observation for every DEVELOPMENT_PK, because the IF statement is testing for the occurence of FIRST.DEVELOOPMENT_PK = 1. The automatic variable FIRST.DEVELOPMENT_PK will be equal to 1 at the first observation and equal to 0 on the other observations for the by group. The shorthand or Boolean version of the IF statement:&lt;BR /&gt;
if first.development_pk then.... &lt;BR /&gt;
is the same as coding&lt;BR /&gt;
if first.development_pk = 1 then ....&lt;BR /&gt;
 &lt;BR /&gt;
If you don't have many observations, then either method would probably be OK. But, if you have a LOT of observations, then you might want to benchmark for performance. Since a join is not involved you might find the DATA step performs better for larger data sets.&lt;BR /&gt;
 &lt;BR /&gt;
cynthia</description>
      <pubDate>Wed, 29 Jul 2009 16:00:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59821#M16895</guid>
      <dc:creator>Cynthia_sas</dc:creator>
      <dc:date>2009-07-29T16:00:04Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative Sum Across Years and Months</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59822#M16896</link>
      <description>Thanks for the improvement, I think this will be much more scalable in the future.</description>
      <pubDate>Wed, 29 Jul 2009 16:02:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Cumulative-Sum-Across-Years-and-Months/m-p/59822#M16896</guid>
      <dc:creator>StephenOverton</dc:creator>
      <dc:date>2009-07-29T16:02:48Z</dc:date>
    </item>
  </channel>
</rss>

