I'd like to know how to do 10 day roll up summary. Thanks.
date | count | 10day sum |
1/1/2012 | 2 | |
1/2/2012 | 4 | |
1/3/2012 | 3 | |
1/4/2012 | 6 | |
1/5/2012 | 7 | |
1/6/2012 | 9 | |
1/7/2012 | 65 | |
1/8/2012 | 4 | |
1/9/2012 | 8 | |
1/10/2012 | 56 | 164 |
1/11/2012 | 4 | 166 |
1/12/2012 | 5 | 167 |
1/13/2012 | 4 | 168 |
1/14/2012 | 2 | 164 |
1/15/2012 | 2 | 159 |
1/16/2012 | 3 | 153 |
1/17/2012 | 4 | 92 |
1/18/2012 | 3 | 91 |
1/19/2012 | 2 | 85 |
1/20/2012 | 4 | 33 |
1/21/2012 | 5 | 34 |
1/22/2012 | 3 | 32 |
1/23/2012 | 5 | 33 |
1/24/2012 | 8 | 39 |
1/25/2012 | 9 | 46 |
1/26/2012 | 0 | 43 |
1/27/2012 | 7 | 46 |
1/28/2012 | 6 | 49 |
1/29/2012 | 6 | 53 |
1/30/2012 | 45 | 94 |
data foo;
input date mmddyy10. count;
cards;
1/1/2012 2
1/2/2012 4
1/3/2012 3
1/4/2012 6
1/5/2012 7
1/6/2012 9
1/7/2012 65
1/8/2012 4
1/9/2012 8
1/10/2012 56
1/11/2012 4
1/12/2012 5
1/13/2012 4
1/14/2012 2
1/15/2012 2
1/16/2012 3
1/17/2012 4
1/18/2012 3
1/19/2012 2
1/20/2012 4
1/21/2012 5
1/22/2012 3
1/23/2012 5
1/24/2012 8
1/25/2012 9
1/26/2012 0
1/27/2012 7
1/28/2012 6
1/29/2012 6
1/30/2012 45
;
run;
data bar;
if _n_=1 then
do i=1 to 9;
set foo;
call missing(sum);
output;
end;
sum=0;
set foo (firstobs=10);
do i=1 to 10;
p=(_n_-1)+i;
set foo point=p;
sum+count;
end;
output;
drop i;
run;
date | count | sum |
18993 | 2 | . |
18994 | 4 | . |
18995 | 3 | . |
18996 | 6 | . |
18997 | 7 | . |
18998 | 9 | . |
18999 | 65 | . |
19000 | 4 | . |
19001 | 8 | . |
19002 | 56 | 164 |
19003 | 4 | 166 |
19004 | 5 | 167 |
19005 | 4 | 168 |
19006 | 2 | 164 |
19007 | 2 | 159 |
19008 | 3 | 153 |
19009 | 4 | 92 |
19010 | 3 | 91 |
19011 | 2 | 85 |
19012 | 4 | 33 |
19013 | 5 | 34 |
19014 | 3 | 32 |
19015 | 5 | 33 |
19016 | 8 | 39 |
19017 | 9 | 46 |
19018 | 0 | 43 |
19019 | 7 | 46 |
19020 | 6 | 49 |
19021 | 6 | 53 |
19022 | 45 | 94 |
Here we are, again. without involving ETS, in addition to FriedEgg's solution, here is another approach (credit to Howard, Ksharp):
data have;
infile cards truncover;
input date :mmddyy10. count;
format date mmddyy10.;
cards;
1/1/2012 2
1/2/2012 4
1/3/2012 3
1/4/2012 6
1/5/2012 7
1/6/2012 9
1/7/2012 65
1/8/2012 4
1/9/2012 8
1/10/2012 56
1/11/2012 4
1/12/2012 5
1/13/2012 4
1/14/2012 2
1/15/2012 2
1/16/2012 3
1/17/2012 4
1/18/2012 3
1/19/2012 2
1/20/2012 4
1/21/2012 5
1/22/2012 3
1/23/2012 5
1/24/2012 8
1/25/2012 9
1/26/2012 0
1/27/2012 7
1/28/2012 6
1/29/2012 6
1/30/2012 45
;
data want;
array v(0:9) _temporary_;
set have;
v(mod(_n_,10))=count;
if _n_>=10 then day_sum=sum( of v(*));
run;
proc print;run;
Haikuo
Beat me to it...and that code is definitely slick and concise. If it is too slick:
data want;
set have;
array sumz {10};
retain sumz rowcnt 0;
rowcnt +1;
sumz{rowcnt}=count;
if _n_ >=10 then do;
tendaycount=sum(of sumz1-sumz10);
if _n_ in (10,20,30)then rowcnt=0;
end;
keep date1 count tendaycount;
run;
I have a SQL solution instead of it.Not need to pre-sort dataset.
data foo; input date mmddyy10. count; format date mmddyy10.; cards; 1/1/2012 2 1/2/2012 4 1/3/2012 3 1/4/2012 6 1/5/2012 7 1/6/2012 9 1/7/2012 65 1/8/2012 4 1/9/2012 8 1/10/2012 56 1/11/2012 4 1/12/2012 5 1/13/2012 4 1/14/2012 2 1/15/2012 2 1/16/2012 3 1/17/2012 4 1/18/2012 3 1/19/2012 2 1/20/2012 4 1/21/2012 5 1/22/2012 3 1/23/2012 5 1/24/2012 8 1/25/2012 9 1/26/2012 0 1/27/2012 7 1/28/2012 6 1/29/2012 6 1/30/2012 45 ; run; proc sql; create table want as select date,count,case when date lt min(date)+9 then . else (select sum(count) from foo as a where a.date between b.date-9 and b.date) end as sum from foo as b; quit;
Ksharp
Liked! SQL is indeed a powerful tool!
Haikuo
Of course, as a rule, SQL is the only way to do this properly (duck :smileysilly:) as in the bullet proof :
proc sql;
create table want as
select A.date, A.count, ifn(count(distinct B.date)=10, sum(B.count), .) as ten_day_sum
from foo as A left join foo as B
on B.date between intnx("DAY", A.date, -9, "Beginning") and A.date
group by A.date, A.count;
select * from want;
quit;
PG
using macro and lag function:
%macro lag_sum(var,n);
%let lag_sum=lag(&var);
%do i=2 %to &n;
%let lag_sum=&lag_sum,lag&i(&var);
%end;
sum_&var&n=sum(&lag_sum);
if _n_ < &n then call missing(sum_&var&n);
%mend;
data bar;
set foo;
%lag_sum(count,10);
run;
I like Haikuo's FIFO approach but, if you are concerned about 10 day sums and don't really have all dates, then you might need to expand the code a little. E.g.:
%let roll=10;
data want;
array v{0:%eval(&roll.-1)} _temporary_;
array d{0:%eval(&roll.-1)} _temporary_;
set have;
v{mod(_n_,&roll.)}=count;
d{mod(_n_,&roll.)}=date;
do i=0 to 9;
if not missing(d{i}) then do;
if date-d{i} gt %eval(&roll.-1) then do;
call missing(v{i});
call missing(d{i});
end;
end;
end;
if _n_>=&roll. then day_sum=sum( of v{*});
run;
I agree with your point Arthur. Solutions that depend on sorted values can be guarantied to work but if you rely on complete data, that's more difficult to enforce. So better be safe than sorry... Moreover, the overhead cost can be minimal (I removed one data line to show the effect) :
data have;
infile cards truncover;
input date :mmddyy10. count;
format date mmddyy10.;
cards;
1/1/2012 2
1/2/2012 4
1/3/2012 3
1/4/2012 6
1/5/2012 7
1/6/2012 9
1/7/2012 65
1/8/2012 4
1/9/2012 8
1/10/2012 56
1/11/2012 4
1/12/2012 5
1/13/2012 4
1/14/2012 2
1/16/2012 3
1/17/2012 4
1/18/2012 3
1/19/2012 2
1/20/2012 4
1/21/2012 5
1/22/2012 3
1/23/2012 5
1/24/2012 8
1/25/2012 9
1/26/2012 0
1/27/2012 7
1/28/2012 6
1/29/2012 6
1/30/2012 45
;
%let wnd=10;
data want;
array d{0:0} _temporary_;
array s{0:&wnd.} _temporary_;
set have;
d{0} = 1 + coalesce(d{0}, date-1);
do while (d{0} < date);
s{mod(d{0}, &wnd)} = .; /* creates a hole when sum is incomplete (preferred)*/
*s{mod(d{0}, &wnd)} = 0; /* accepts incomplete sums */
d{0} + 1;
end;
s{mod(date, &wnd)} = count;
if nmiss(of s{*}) = 1 then day_sum = sum(of s{*});
else day_sum = .;
run;
proc print data=want; run;
PG
It doesn't get any easier than this:
data want (drop=_:);
set have;
retain _cum 0;
_cum + ( count - coalesce( lag10( count ),0 ));
if _n_ <= 10 then cum_sum = .;
else cum_sum = _cum;
run;
Each iteration of the data step adds to the cumulative sum the new value for count and subtracts the 10th lag of count.
Nice!
if _n_ <= 10 then cum_sum = .;
should be
if _n_ < 10 then cum_sum = .;
Good catch. Thanks.
hi ... here's a tweak of SAS_Bigot's nice solution ...
data want (drop=_cum);
set have;
_cum + sum(count, lag10(-count));
cum_sum = ifn(_n_ lt 10, . , _cum);
run;
Awesome! So far we have Array, Lag(), SQL, Macro, DOW, and let's see how far this great thread can lead us to!
Haikuo
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.