BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aopsb
Calcite | Level 5

Hi all,

 

I have a programming question regarding the calculation of a rolling sum for a range of variables. To simplify the question I have only included two variables to be calculated. The sum must be calculated for each customer represented in the data.

 

DATA = work.ydelser

 

PERSON ID         DATE                 TOTAL      PROD_A          ...

A                          08-01-2014         10                 ...     

A                          13-01-2014         12                 ...

A                          23-01-2014         11

A                          12-02-2014         17

B                          07-01-2014         26

B                          17-01-2014         50

B                          25-02-2014         10

B                          03-03-2014         10

 

The result I would like to have is:

 

DATA = work.ydel_sum

 

PERSON ID         DATE                 TOTALsum        PROD_A

A                          08-01-2014         10                         ...

A                          13-01-2014         22

A                          23-01-2014         33

A                          12-02-2014         22

B                          07-01-2014         26

B                          17-01-2014         76

B                          25-02-2014         10

B                          03-03-2014         20

 

The computed dataset does not have to include the original variable. The date is in "date format".

 

I will use the code in a code-program in Enterprise Guide, but if anyone knows how to solve the question by Enterprise functions, please let me know.

 

I have seen a privious topic on this matter, but I cannot figure out how to use the "cards" function.

 

Thanks in advance!

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Here is code which will do a rolling 30 day sum for you.  The datalines or cards is just used to create the test data (I have left in to show you).  I do not understand why the fourth A row in your example out data is 22 though, that doesn't make sense with the logic.

data have;
  informat date ddmmyy10.;
  format date date9.;
  input person_id $ date total;
datalines;
A 08-01-2014 10                     
A 13-01-2014 12                 
A 23-01-2014 11
A 12-02-2014 17
;
run;

proc sql;
  create table WANT as
  select  PERSON_ID,
          DATE,
          TOTAL,
          sum(TMP) as ROLLING_SUM
  from    (
            select  A.*,
                    B.TOTAL as TMP
            from    HAVE A
            left join HAVE B
            on      A.PERSON_ID=B.PERSON_ID
            and     A.DATE-30 <= B.DATE <= A.DATE
          )
  group by PERSON_ID,
           DATE,
           TOTAL;
quit;

What this does is join all records where date is within 30 days to the original data, this is the subquery.  That data is then grouped by the original variables and a sum calculated .

View solution in original post

7 REPLIES 7
pearsoninst
Pyrite | Level 9
Hope this helps :) 

data ydelser;
input PERSONID$ TOTAL;
datalines;
A 10
A 12
A 14
A 17
B 30
B 30
B 40
B 50
;
Data newdata;
set ydelser;
by PERSONID notsorted;
if first.PERSONID then TOTALsum = 0;
TOTALsum+TOTAL;
run;
Proc Print data = newdata;
run;
aopsb
Calcite | Level 5

Thanks!

 

I've forgot to mention that I have hundreds of rows. Do I have so specify all in the datalines?

pearsoninst
Pyrite | Level 9
If you need only PERSONID$ TOTAL and the new variable "TOTALsum" then you don't have to ..
You use Proc Import to import the file ..so you dont have to write anything datelines .

Here is the link ..for Proc Import

https://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000332605.htm
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Here is code which will do a rolling 30 day sum for you.  The datalines or cards is just used to create the test data (I have left in to show you).  I do not understand why the fourth A row in your example out data is 22 though, that doesn't make sense with the logic.

data have;
  informat date ddmmyy10.;
  format date date9.;
  input person_id $ date total;
datalines;
A 08-01-2014 10                     
A 13-01-2014 12                 
A 23-01-2014 11
A 12-02-2014 17
;
run;

proc sql;
  create table WANT as
  select  PERSON_ID,
          DATE,
          TOTAL,
          sum(TMP) as ROLLING_SUM
  from    (
            select  A.*,
                    B.TOTAL as TMP
            from    HAVE A
            left join HAVE B
            on      A.PERSON_ID=B.PERSON_ID
            and     A.DATE-30 <= B.DATE <= A.DATE
          )
  group by PERSON_ID,
           DATE,
           TOTAL;
quit;

What this does is join all records where date is within 30 days to the original data, this is the subquery.  That data is then grouped by the original variables and a sum calculated .

aopsb
Calcite | Level 5

You are right, the fourth row should have had the value 40. Sorry for the mistake.

dcruik
Lapis Lazuli | Level 10

This solution is pretty much the same as @pearsoninst provided; however, it includes the DATE variable and will keep only the last 30 days from the current date.  If you want the last 30 days of a different date than the current, just substitute whatever date you want in place of the today().

 

data ydelser;
input PERSON_ID$ DATE ddmmyy10. TOTAL;
datalines;
A 08-01-2014 10
A 13-01-2014 12
A 23-01-2014 14
A 12-02-2014 17
B 07-01-2014 30
B 17-01-2014 30
B 25-02-2014 40
B 03-03-2014 50
;
run;

proc sort data=ydelser;
by PERSON_ID;
run;

data ydel_sum;
set ydelser;
by PERSON_ID;
Where DATE>=intnx("DAY",today(),-30);
If First.PERSON_ID then TOTALsum=0;
TOTALsum+TOTAL;
run;
aopsb
Calcite | Level 5

Thank you for all your replies! Both solutions worked, and I have accepted the one that I used as the solution.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 12704 views
  • 1 like
  • 4 in conversation