I have a dataset like this for each ID;
data have;
input Months ID Number; 2018-07-01 1 0 2018-08-01 1 0 2018-09-01 1 1 2018-10-01 1 3 2018-11-01 1 1 2018-12-01 1 2 2019-01-01 1 0 2019-02-01 1 0 2019-03-01 1 1 2019-04-01 1 0 2019-05-01 1 0 2019-06-01 1 0 2019-07-01 1 1 2019-08-01 1 0 2019-09-01 1 0 2019-10-01 1 2 2019-11-01 1 0 2019-12-01 1 0 2020-01-01 1 0 2020-02-01 1 0 2020-03-01 1 0 2020-04-01 1 0 2020-05-01 1 0 2020-06-01 1 0 2020-07-01 1 0 2020-08-01 1 1 2020-09-01 1 0 2020-10-01 1 0 2020-11-01 1 1 2020-12-01 1 0 2021-01-01 1 0 2021-02-01 1 1 2021-03-01 1 1 2021-04-01 1 0 2018-07-01 2 0 ....... ....... .......
(Similar values for each ID)
I want a dataset like this;
data want;
input Months ID Number Sum_Next_6Numbers;
2018-07-01 1 0 7
2018-08-01 1 0 7
2018-09-01 1 1 7
2018-10-01 1 3 4
2018-11-01 1 1 3
2018-12-01 1 2 1
2019-01-01 1 0 2
2019-02-01 1 0 2
2019-03-01 1 1 1
2019-04-01 1 0 3
2019-05-01 1 0 3
2019-06-01 1 0 3
2019-07-01 1 1 2
2019-08-01 1 0 2
2019-09-01 1 0 2
2019-10-01 1 2 0
2019-11-01 1 0 0
2019-12-01 1 0 0
2020-01-01 1 0 0
2020-02-01 1 0 1
2020-03-01 1 0 1
2020-04-01 1 0 1
2020-05-01 1 0 2
2020-06-01 1 0 2
2020-07-01 1 0 2
2020-08-01 1 1 2
2020-09-01 1 0 3
2020-10-01 1 0 3
2020-11-01 1 1 Nan
2020-12-01 1 0 Nan
2021-01-01 1 0 Nan
2021-02-01 1 1 Nan
2021-03-01 1 1 Nan
2021-04-01 1 0 Nan
2018-07-01 2 0 0
....... ....... .......
I want to sum up next 6 row's values(for number column). If there is no 6 months left then this values should be Nan.
e. g. for first row(2018-07-01) want to sum next 6 rows (2018-08-01,2018-09-01,2018-10-01,2018-11-01,2018-12-01,2019-01-01) of Number column.
Is there a way to do this? Thank you in advance.
Otherwise, here is an approach.
I changed the have data set to show that it handles by-groups.
data have;
input Months : yymmdd10. ID Number;
format Months yymmdd10.;
datalines;
2018-07-01 1 0
2018-08-01 1 0
2018-09-01 1 1
2018-10-01 1 3
2018-11-01 1 1
2018-12-01 1 2
2019-01-01 1 0
2019-02-01 1 0
2019-03-01 1 1
2019-04-01 1 0
2019-05-01 1 0
2019-06-01 1 0
2019-07-01 1 1
2019-08-01 1 0
2019-09-01 1 0
2019-10-01 1 2
2019-11-01 1 0
2019-12-01 1 0
2020-01-01 1 0
2020-02-01 1 0
2020-03-01 1 0
2020-04-01 1 0
2020-05-01 1 0
2020-06-01 1 0
2020-07-01 1 0
2020-08-01 1 1
2020-09-01 1 0
2020-10-01 1 0
2020-11-01 1 1
2020-12-01 1 0
2021-01-01 1 0
2021-02-01 1 1
2021-03-01 1 1
2021-04-01 1 0
2018-07-01 2 0
2018-08-01 2 0
2018-09-01 2 1
2018-10-01 2 3
2018-11-01 2 1
2018-12-01 2 2
2019-01-01 2 0
2019-02-01 2 0
2019-03-01 2 1
2019-04-01 2 0
2019-05-01 2 0
2019-06-01 2 0
2019-07-01 2 1
2019-08-01 2 0
2019-09-01 2 0
2019-10-01 2 2
2019-11-01 2 0
2019-12-01 2 0
2020-01-01 2 0
2020-02-01 2 0
2020-03-01 2 0
2020-04-01 2 0
2020-05-01 2 0
2020-06-01 2 0
2020-07-01 2 0
2020-08-01 2 1
2020-09-01 2 0
2020-10-01 2 0
2020-11-01 2 1
2020-12-01 2 0
2021-01-01 2 0
2021-02-01 2 1
2021-03-01 2 1
2021-04-01 2 0
;
data want(drop = i n);
set have curobs = c nobs = nobs;
Sum_Next_6Numbers = 0;
do p = c + 1 to 6 + c;
if p > nobs then do;
Sum_Next_6Numbers = .; leave;
end;
set have(keep = Number ID rename = (Number = n id = i)) point = p;
if id ne i then do;
Sum_Next_6Numbers = .; leave;
end;
Sum_Next_6Numbers + n;
end;
run;
Result:
Obs Months ID Number Sum_Next_6Numbers 1 2018-07-01 1 0 7 2 2018-08-01 1 0 7 3 2018-09-01 1 1 7 4 2018-10-01 1 3 4 5 2018-11-01 1 1 3 6 2018-12-01 1 2 1 7 2019-01-01 1 0 2 8 2019-02-01 1 0 2 9 2019-03-01 1 1 1 10 2019-04-01 1 0 3 11 2019-05-01 1 0 3 12 2019-06-01 1 0 3 13 2019-07-01 1 1 2 14 2019-08-01 1 0 2 15 2019-09-01 1 0 2 16 2019-10-01 1 2 0 17 2019-11-01 1 0 0 18 2019-12-01 1 0 0 19 2020-01-01 1 0 0 20 2020-02-01 1 0 1 21 2020-03-01 1 0 1 22 2020-04-01 1 0 1 23 2020-05-01 1 0 2 24 2020-06-01 1 0 2 25 2020-07-01 1 0 2 26 2020-08-01 1 1 2 27 2020-09-01 1 0 3 28 2020-10-01 1 0 3 29 2020-11-01 1 1 . 30 2020-12-01 1 0 . 31 2021-01-01 1 0 . 32 2021-02-01 1 1 . 33 2021-03-01 1 1 . 34 2021-04-01 1 0 . 35 2018-07-01 2 0 7 36 2018-08-01 2 0 7 37 2018-09-01 2 1 7 38 2018-10-01 2 3 4 39 2018-11-01 2 1 3 40 2018-12-01 2 2 1 41 2019-01-01 2 0 2 42 2019-02-01 2 0 2 43 2019-03-01 2 1 1 44 2019-04-01 2 0 3 45 2019-05-01 2 0 3 46 2019-06-01 2 0 3 47 2019-07-01 2 1 2 48 2019-08-01 2 0 2 49 2019-09-01 2 0 2 50 2019-10-01 2 2 0 51 2019-11-01 2 0 0 52 2019-12-01 2 0 0 53 2020-01-01 2 0 0 54 2020-02-01 2 0 1 55 2020-03-01 2 0 1 56 2020-04-01 2 0 1 57 2020-05-01 2 0 2 58 2020-06-01 2 0 2 59 2020-07-01 2 0 2 60 2020-08-01 2 1 2 61 2020-09-01 2 0 3 62 2020-10-01 2 0 3 63 2020-11-01 2 1 . 64 2020-12-01 2 0 . 65 2021-01-01 2 0 . 66 2021-02-01 2 1 . 67 2021-03-01 2 1 . 68 2021-04-01 2 0 .
You can sort the dataset by ID and also within ID in reverse order (by descending) "months" and then use the LAG function to get the previous 6 values and sum them up whilst checking if you are in the same value of ID in the LAG as in the current record.
There is a paper on this topic which discusses this issue and offers also a different approach without the LAG function: https://support.sas.com/resources/papers/proceedings16/11221-2016.pdf
Do you have SAS/ETS?
Otherwise, here is an approach.
I changed the have data set to show that it handles by-groups.
data have;
input Months : yymmdd10. ID Number;
format Months yymmdd10.;
datalines;
2018-07-01 1 0
2018-08-01 1 0
2018-09-01 1 1
2018-10-01 1 3
2018-11-01 1 1
2018-12-01 1 2
2019-01-01 1 0
2019-02-01 1 0
2019-03-01 1 1
2019-04-01 1 0
2019-05-01 1 0
2019-06-01 1 0
2019-07-01 1 1
2019-08-01 1 0
2019-09-01 1 0
2019-10-01 1 2
2019-11-01 1 0
2019-12-01 1 0
2020-01-01 1 0
2020-02-01 1 0
2020-03-01 1 0
2020-04-01 1 0
2020-05-01 1 0
2020-06-01 1 0
2020-07-01 1 0
2020-08-01 1 1
2020-09-01 1 0
2020-10-01 1 0
2020-11-01 1 1
2020-12-01 1 0
2021-01-01 1 0
2021-02-01 1 1
2021-03-01 1 1
2021-04-01 1 0
2018-07-01 2 0
2018-08-01 2 0
2018-09-01 2 1
2018-10-01 2 3
2018-11-01 2 1
2018-12-01 2 2
2019-01-01 2 0
2019-02-01 2 0
2019-03-01 2 1
2019-04-01 2 0
2019-05-01 2 0
2019-06-01 2 0
2019-07-01 2 1
2019-08-01 2 0
2019-09-01 2 0
2019-10-01 2 2
2019-11-01 2 0
2019-12-01 2 0
2020-01-01 2 0
2020-02-01 2 0
2020-03-01 2 0
2020-04-01 2 0
2020-05-01 2 0
2020-06-01 2 0
2020-07-01 2 0
2020-08-01 2 1
2020-09-01 2 0
2020-10-01 2 0
2020-11-01 2 1
2020-12-01 2 0
2021-01-01 2 0
2021-02-01 2 1
2021-03-01 2 1
2021-04-01 2 0
;
data want(drop = i n);
set have curobs = c nobs = nobs;
Sum_Next_6Numbers = 0;
do p = c + 1 to 6 + c;
if p > nobs then do;
Sum_Next_6Numbers = .; leave;
end;
set have(keep = Number ID rename = (Number = n id = i)) point = p;
if id ne i then do;
Sum_Next_6Numbers = .; leave;
end;
Sum_Next_6Numbers + n;
end;
run;
Result:
Obs Months ID Number Sum_Next_6Numbers 1 2018-07-01 1 0 7 2 2018-08-01 1 0 7 3 2018-09-01 1 1 7 4 2018-10-01 1 3 4 5 2018-11-01 1 1 3 6 2018-12-01 1 2 1 7 2019-01-01 1 0 2 8 2019-02-01 1 0 2 9 2019-03-01 1 1 1 10 2019-04-01 1 0 3 11 2019-05-01 1 0 3 12 2019-06-01 1 0 3 13 2019-07-01 1 1 2 14 2019-08-01 1 0 2 15 2019-09-01 1 0 2 16 2019-10-01 1 2 0 17 2019-11-01 1 0 0 18 2019-12-01 1 0 0 19 2020-01-01 1 0 0 20 2020-02-01 1 0 1 21 2020-03-01 1 0 1 22 2020-04-01 1 0 1 23 2020-05-01 1 0 2 24 2020-06-01 1 0 2 25 2020-07-01 1 0 2 26 2020-08-01 1 1 2 27 2020-09-01 1 0 3 28 2020-10-01 1 0 3 29 2020-11-01 1 1 . 30 2020-12-01 1 0 . 31 2021-01-01 1 0 . 32 2021-02-01 1 1 . 33 2021-03-01 1 1 . 34 2021-04-01 1 0 . 35 2018-07-01 2 0 7 36 2018-08-01 2 0 7 37 2018-09-01 2 1 7 38 2018-10-01 2 3 4 39 2018-11-01 2 1 3 40 2018-12-01 2 2 1 41 2019-01-01 2 0 2 42 2019-02-01 2 0 2 43 2019-03-01 2 1 1 44 2019-04-01 2 0 3 45 2019-05-01 2 0 3 46 2019-06-01 2 0 3 47 2019-07-01 2 1 2 48 2019-08-01 2 0 2 49 2019-09-01 2 0 2 50 2019-10-01 2 2 0 51 2019-11-01 2 0 0 52 2019-12-01 2 0 0 53 2020-01-01 2 0 0 54 2020-02-01 2 0 1 55 2020-03-01 2 0 1 56 2020-04-01 2 0 1 57 2020-05-01 2 0 2 58 2020-06-01 2 0 2 59 2020-07-01 2 0 2 60 2020-08-01 2 1 2 61 2020-09-01 2 0 3 62 2020-10-01 2 0 3 63 2020-11-01 2 1 . 64 2020-12-01 2 0 . 65 2021-01-01 2 0 . 66 2021-02-01 2 1 . 67 2021-03-01 2 1 . 68 2021-04-01 2 0 .
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.