I have a dataset that has a Year column and a Death column. I want to create 3 year average deaths column (where the first and last years just have the single year's deaths).
For example:
year | deaths | 3 year average deaths |
2009 | 6 | 6 |
2010 | 17 | 8.666666667 |
2011 | 3 | 16.33333333 |
2012 | 29 | 25 |
2013 | 43 | 29 |
2014 | 15 | 21.66666667 |
2015 | 7 | 8.666666667 |
2016 | 4 | 7.333333333 |
2017 | 11 | 9.333333333 |
2018 | 13 | 15.33333333 |
2019 | 22 | 21.66666667 |
2020 | 30 | 30 |
So the entry for the 3 year average deaths for 2010 is (6+17+3)/3=8.666666667.
How would I do this in SAS?
Thanks in advance!
If you have SAS/ETS, which contains PROC EXPAND, you can use the solution provided by @sbxkoenk.
Otherwise a little data step programming is needed, with code to include the preceding, current, and subsequent death counts.
data work.deaths;
informat year $4.;
input year $ deaths ;
date=mdy(12,31,input(year,4.));
format date date9.;
cards;
2009 6
2010 17
2011 3
2012 29
2013 43
2014 15
2015 7
2016 4
2017 11
2018 13
2019 22
2020 30
run;
data want (drop=_:);
merge deaths
deaths (firstobs=2 keep=deaths rename=(deaths=_nxt_deaths))
end=lastrec;
centered_moving_average=mean(lag(deaths),deaths,_nxt_deaths);
if _n_=1 or lastrec=1 then centered_moving_average=deaths;
run;
Hello,
What you are after is a centered moving average.
You can do that with the CMOVAVE transformation operation in PROC EXPAND (you need SAS/ETS).
data work.deaths;
informat year $4.;
input year $ deaths three3_year_average_deaths;
date=mdy(12,31,input(year,4.));
format date date9.;
cards;
2009 6 6
2010 17 8.666666667
2011 3 16.33333333
2012 29 25
2013 43 29
2014 15 21.66666667
2015 7 8.666666667
2016 4 7.333333333
2017 11 9.333333333
2018 13 15.33333333
2019 22 21.66666667
2020 30 30
;
run;
proc expand data=work.deaths out=work.death_exp method=none;
id date;
convert deaths = deaths_exp / transformout=(cmovave 3);
run;
/* end of program */
Cheers,
Koen
If you have SAS/ETS, which contains PROC EXPAND, you can use the solution provided by @sbxkoenk.
Otherwise a little data step programming is needed, with code to include the preceding, current, and subsequent death counts.
data work.deaths;
informat year $4.;
input year $ deaths ;
date=mdy(12,31,input(year,4.));
format date date9.;
cards;
2009 6
2010 17
2011 3
2012 29
2013 43
2014 15
2015 7
2016 4
2017 11
2018 13
2019 22
2020 30
run;
data want (drop=_:);
merge deaths
deaths (firstobs=2 keep=deaths rename=(deaths=_nxt_deaths))
end=lastrec;
centered_moving_average=mean(lag(deaths),deaths,_nxt_deaths);
if _n_=1 or lastrec=1 then centered_moving_average=deaths;
run;
Thank you, this worked!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.