I have a dataset that has a Year column and a Death column. I want to create 3 year average deaths column (where the first and last years just have the single year's deaths).
For example:
year | deaths | 3 year average deaths |
2009 | 6 | 6 |
2010 | 17 | 8.666666667 |
2011 | 3 | 16.33333333 |
2012 | 29 | 25 |
2013 | 43 | 29 |
2014 | 15 | 21.66666667 |
2015 | 7 | 8.666666667 |
2016 | 4 | 7.333333333 |
2017 | 11 | 9.333333333 |
2018 | 13 | 15.33333333 |
2019 | 22 | 21.66666667 |
2020 | 30 | 30 |
So the entry for the 3 year average deaths for 2010 is (6+17+3)/3=8.666666667.
How would I do this in SAS?
Thanks in advance!
If you have SAS/ETS, which contains PROC EXPAND, you can use the solution provided by @sbxkoenk.
Otherwise a little data step programming is needed, with code to include the preceding, current, and subsequent death counts.
data work.deaths;
informat year $4.;
input year $ deaths ;
date=mdy(12,31,input(year,4.));
format date date9.;
cards;
2009 6
2010 17
2011 3
2012 29
2013 43
2014 15
2015 7
2016 4
2017 11
2018 13
2019 22
2020 30
run;
data want (drop=_:);
merge deaths
deaths (firstobs=2 keep=deaths rename=(deaths=_nxt_deaths))
end=lastrec;
centered_moving_average=mean(lag(deaths),deaths,_nxt_deaths);
if _n_=1 or lastrec=1 then centered_moving_average=deaths;
run;
Hello,
What you are after is a centered moving average.
You can do that with the CMOVAVE transformation operation in PROC EXPAND (you need SAS/ETS).
data work.deaths;
informat year $4.;
input year $ deaths three3_year_average_deaths;
date=mdy(12,31,input(year,4.));
format date date9.;
cards;
2009 6 6
2010 17 8.666666667
2011 3 16.33333333
2012 29 25
2013 43 29
2014 15 21.66666667
2015 7 8.666666667
2016 4 7.333333333
2017 11 9.333333333
2018 13 15.33333333
2019 22 21.66666667
2020 30 30
;
run;
proc expand data=work.deaths out=work.death_exp method=none;
id date;
convert deaths = deaths_exp / transformout=(cmovave 3);
run;
/* end of program */
Cheers,
Koen
If you have SAS/ETS, which contains PROC EXPAND, you can use the solution provided by @sbxkoenk.
Otherwise a little data step programming is needed, with code to include the preceding, current, and subsequent death counts.
data work.deaths;
informat year $4.;
input year $ deaths ;
date=mdy(12,31,input(year,4.));
format date date9.;
cards;
2009 6
2010 17
2011 3
2012 29
2013 43
2014 15
2015 7
2016 4
2017 11
2018 13
2019 22
2020 30
run;
data want (drop=_:);
merge deaths
deaths (firstobs=2 keep=deaths rename=(deaths=_nxt_deaths))
end=lastrec;
centered_moving_average=mean(lag(deaths),deaths,_nxt_deaths);
if _n_=1 or lastrec=1 then centered_moving_average=deaths;
run;
Thank you, this worked!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.