BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PhuongNguyen
Obsidian | Level 7

Hi people,

I want to do a cumulative sum start from a specific point and end at another specific point. In other words, I have a data set below. At any point where column 'edate' gets a value (for example at edate= 21-APR-2011, or 22-JUL-2011 and so on), I want to do a cumulative sum of  column 'ar' within a range of 5 rows before to 5 rows after that edate. This means the cumulative sum will start from the 5th row before one edate, and end at the 5th row after the same edate. And yes I have many different ric to care about at the same time. Can anyone help please? Thank you.


data WORK.HAVE;
infile datalines dsd truncover;
input RIC:$6. Date:DATE11. edate:DATE11. ar:32.;
format Date DATE11. edate DATE11.;
datalines;
AAM.HM 14-APR-2011 . -0.002572834
AAM.HM 15-APR-2011 . -0.00173594
AAM.HM 18-APR-2011 . -0.004409639
AAM.HM 19-APR-2011 . 0.0010875573
AAM.HM 20-APR-2011 . -0.006170841
AAM.HM 21-APR-2011 21-APR-2011 -0.00456025
AAM.HM 22-APR-2011 . 0.0003152569
AAM.HM 25-APR-2011 . 0.0009864579
AAM.HM 26-APR-2011 . 0.0156740199
AAM.HM 27-APR-2011 . -0.021580028
AAM.HM 28-APR-2011 . -0.002922659
AAM.HM 15-JUL-2011 . -0.001935623
AAM.HM 18-JUL-2011 . -0.026651755
AAM.HM 19-JUL-2011 . 0.0351227743
AAM.HM 20-JUL-2011 . -0.005652773
AAM.HM 21-JUL-2011 . 0.0205199523
AAM.HM 22-JUL-2011 22-JUL-2011 -0.001763768
AAM.HM 25-JUL-2011 . 0.0037312991
AAM.HM 26-JUL-2011 . -0.004845419
AAM.HM 27-JUL-2011 . 0.0690293458
AAM.HM 28-JUL-2011 . 0.0049640581
AAM.HM 29-JUL-2011 . -0.00328081
AAM.HM 17-OCT-2011 . 0.0101872501
AAM.HM 18-OCT-2011 . -0.01965987
AAM.HM 19-OCT-2011 . -0.008891549
AAM.HM 20-OCT-2011 . -0.003530292
AAM.HM 21-OCT-2011 . -0.000678125
AAM.HM 24-OCT-2011 24-OCT-2011 -0.000968388
AAM.HM 25-OCT-2011 . 0.0004973143
AAM.HM 26-OCT-2011 . -0.000291778
AAM.HM 27-OCT-2011 . -0.003253837
AAM.HM 28-OCT-2011 . -0.000593415
AAM.HM 31-OCT-2011 . 0.0011569607
AAM.HM 12-JAN-2012 . -0.002383283
AAM.HM 13-JAN-2012 . 0.0193886276
AAM.HM 16-JAN-2012 . 0.0029433443
AAM.HM 17-JAN-2012 . 0.0299181013
AAM.HM 18-JAN-2012 . 0.0015236122
AAM.HM 19-JAN-2012 19-JAN-2012 -0.009540873
AAM.HM 20-JAN-2012 . 0.0047257709
AAM.HM 30-JAN-2012 . 0.0393953847
AAM.HM 31-JAN-2012 . 0.0130199334
AAM.HM 01-FEB-2012 . -0.030948028
AAM.HM 02-FEB-2012 . 0.0105212996
AAM.HM 17-APR-2012 . 0.0122813699
AAM.HM 18-APR-2012 . 0.0190926121
AAM.HM 19-APR-2012 . 0.0065156055
AAM.HM 20-APR-2012 . 0.0019513214
AAM.HM 23-APR-2012 . -0.000155809
AAM.HM 24-APR-2012 24-APR-2012 0.0033912888
;;;;

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

A simple two-step method keeping only the cumulative summed records:

 

data edates;
set have;
obs = _n_;
if not missing(edate) then output;
keep RIC obs;
rename RIC=targetRIC;
run;

data cumsums;
set edates;
do point = max(obs-5, 0) to min(obs+5, nobs);
    set have point=point nobs=nobs;
    if RIC = targetRIC then do;
        cumSum = sum(cumSum, ar);
        output;
        end;
    end;
drop targetRIC obs;
run;
PG

View solution in original post

10 REPLIES 10
PGStats
Opal | Level 21

A simple two-step method keeping only the cumulative summed records:

 

data edates;
set have;
obs = _n_;
if not missing(edate) then output;
keep RIC obs;
rename RIC=targetRIC;
run;

data cumsums;
set edates;
do point = max(obs-5, 0) to min(obs+5, nobs);
    set have point=point nobs=nobs;
    if RIC = targetRIC then do;
        cumSum = sum(cumSum, ar);
        output;
        end;
    end;
drop targetRIC obs;
run;
PG
novinosrin
Tourmaline | Level 20

Elegant & exquisite. Very nice!

PhuongNguyen
Obsidian | Level 7

Beautiful. Thank so much @PGStats 

PhuongNguyen
Obsidian | Level 7

Hi, me again.

With the same dataset above, I want to count from 1 at the 5th row before one edate up to 11 at the 5th row after the same edate. And then the count will be repeated (1 to 11) if there is another edate. Please help. Thank you.

PGStats
Opal | Level 21

Only one statement to add:

 

data cumsums;
set edates;
do point = max(obs-5, 1) to min(obs+5, nobs);
    set have point=point nobs=nobs;
    if RIC = targetRIC then do;
        cumSum = sum(cumSum, ar);
        row = point - obs + 6;
        output;
        end;
    end;
drop targetRIC obs;
run;

Note, I also corrected the third statement, because point must be greater than zero.

PG
PhuongNguyen
Obsidian | Level 7

Many thanks to @PGStats

PhuongNguyen
Obsidian | Level 7

Hi @PGStats,

Yes, me with the dataset again.

I am trying to put in groups whenever del_eps has a value and del_eps<0 or del_eps>0. When del_eps's value >0, I want to put them (5 rows before and 5 rows after the del_eps) in group 'A', and group 'B' is for del_eps's value <0.

 

I try to add two more rows into the original codes from yours,

 

if del_eps>0 then gr='A';
if del_eps^='.' and del_eps<0 then gr='B';

 

but they only work from row 6 to row 11, and a blank from row 1 to row 5. Please help again. Thank you so much.

PhuongNguyen
Obsidian | Level 7

Well, I got it. I will call the positive del_eps out to be the targetdel_eps in the 1st step, and do the same with negative del_eps. Solved 🙂

Tom
Super User Tom
Super User

So your sample data has EDATE values every 11 observations.  Will your real data be that neat?

If not what do you want to do when there are overlapping windows?

If the gap between EDATE values is larger than your window what do you want in the output? No record at all?  A record with the cumulative sum missing? or zero?

PhuongNguyen
Obsidian | Level 7

Hi Tom,

No, my real data has a larger gap among edates. I did a mid step to extract this dataset out. @FreelanceReinh and @s_lassen help me on topic SELECT N ROWS BEFORE AND AFTER AN OBSERVATION. They also care about overlap in this step. 

 

If I use my real data I want a missing value in the gap between the two edates. Whatever larger than 11 is not my interest, so it is not very important that what value they have. 

 

Thanks Tom.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 2105 views
  • 3 likes
  • 4 in conversation