Hi all,
I have a programming question regarding the calculation of a rolling sum for a range of variables. To simplify the question I have only included two variables to be calculated. The sum must be calculated for each customer represented in the data.
DATA = work.ydelser
PERSON ID DATE TOTAL PROD_A ...
A 08-01-2014 10 ...
A 13-01-2014 12 ...
A 23-01-2014 11
A 12-02-2014 17
B 07-01-2014 26
B 17-01-2014 50
B 25-02-2014 10
B 03-03-2014 10
The result I would like to have is:
DATA = work.ydel_sum
PERSON ID DATE TOTALsum PROD_A
A 08-01-2014 10 ...
A 13-01-2014 22
A 23-01-2014 33
A 12-02-2014 22
B 07-01-2014 26
B 17-01-2014 76
B 25-02-2014 10
B 03-03-2014 20
The computed dataset does not have to include the original variable. The date is in "date format".
I will use the code in a code-program in Enterprise Guide, but if anyone knows how to solve the question by Enterprise functions, please let me know.
I have seen a privious topic on this matter, but I cannot figure out how to use the "cards" function.
Thanks in advance!
Here is code which will do a rolling 30 day sum for you. The datalines or cards is just used to create the test data (I have left in to show you). I do not understand why the fourth A row in your example out data is 22 though, that doesn't make sense with the logic.
data have; informat date ddmmyy10.; format date date9.; input person_id $ date total; datalines; A 08-01-2014 10 A 13-01-2014 12 A 23-01-2014 11 A 12-02-2014 17 ; run; proc sql; create table WANT as select PERSON_ID, DATE, TOTAL, sum(TMP) as ROLLING_SUM from ( select A.*, B.TOTAL as TMP from HAVE A left join HAVE B on A.PERSON_ID=B.PERSON_ID and A.DATE-30 <= B.DATE <= A.DATE ) group by PERSON_ID, DATE, TOTAL; quit;
What this does is join all records where date is within 30 days to the original data, this is the subquery. That data is then grouped by the original variables and a sum calculated .
Hope this helps :) data ydelser; input PERSONID$ TOTAL; datalines; A 10 A 12 A 14 A 17 B 30 B 30 B 40 B 50 ; Data newdata; set ydelser; by PERSONID notsorted; if first.PERSONID then TOTALsum = 0; TOTALsum+TOTAL; run; Proc Print data = newdata; run;
Thanks!
I've forgot to mention that I have hundreds of rows. Do I have so specify all in the datalines?
Here is code which will do a rolling 30 day sum for you. The datalines or cards is just used to create the test data (I have left in to show you). I do not understand why the fourth A row in your example out data is 22 though, that doesn't make sense with the logic.
data have; informat date ddmmyy10.; format date date9.; input person_id $ date total; datalines; A 08-01-2014 10 A 13-01-2014 12 A 23-01-2014 11 A 12-02-2014 17 ; run; proc sql; create table WANT as select PERSON_ID, DATE, TOTAL, sum(TMP) as ROLLING_SUM from ( select A.*, B.TOTAL as TMP from HAVE A left join HAVE B on A.PERSON_ID=B.PERSON_ID and A.DATE-30 <= B.DATE <= A.DATE ) group by PERSON_ID, DATE, TOTAL; quit;
What this does is join all records where date is within 30 days to the original data, this is the subquery. That data is then grouped by the original variables and a sum calculated .
You are right, the fourth row should have had the value 40. Sorry for the mistake.
This solution is pretty much the same as @pearsoninst provided; however, it includes the DATE variable and will keep only the last 30 days from the current date. If you want the last 30 days of a different date than the current, just substitute whatever date you want in place of the today().
data ydelser;
input PERSON_ID$ DATE ddmmyy10. TOTAL;
datalines;
A 08-01-2014 10
A 13-01-2014 12
A 23-01-2014 14
A 12-02-2014 17
B 07-01-2014 30
B 17-01-2014 30
B 25-02-2014 40
B 03-03-2014 50
;
run;
proc sort data=ydelser;
by PERSON_ID;
run;
data ydel_sum;
set ydelser;
by PERSON_ID;
Where DATE>=intnx("DAY",today(),-30);
If First.PERSON_ID then TOTALsum=0;
TOTALsum+TOTAL;
run;
Thank you for all your replies! Both solutions worked, and I have accepted the one that I used as the solution.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.