Assuming your data has been sorted by id and year.
data HAVE; input ID YEAR VAL; cards; 1. 1990. 0.24 2. 2003. 0.13 2. 2004. 0.22 2. 2005. 0.26 2. 2006. 0.1 2. 2007 0.95 5. 1998. 0.2 5. 1999. 0.33 5. 2000. 0.42 5 2001. 0.01 5. 2002. 0.42 5. 2004. 0.54 5. 2005. 0.83 5. 2006. 0.13 5. 2007. 0.25 5. 2011. 0.98 5. 2012. 0.4 5. 2016. 0.32 5. 2018. 0.15 ; run; data want; if _n_=1 then do; k=.;v=.; declare hash h(); h.definekey('k'); h.definedata('v'); h.definedone(); end; do until(last.id); set have; by id; h.add(key:year,data:val); n=0;sum=0; do k=year-10 to year; if h.find()=0 then do;n+1;sum+v; end; end; if n>3 then m_avg=sum/n; output; end; h.clear(); drop k v n sum; run;
Like this?
data HAVE;
input ID YEAR VAL;
cards;
1. 1990. 0.24
2. 2003. 0.13
2. 2004. 0.22
2. 2005. 0.26
2. 2006. 0.1
2. 2007 0.95
5. 1998. 0.2
5. 1999. 0.33
5. 2000. 0.42
5 2001. 0.01
5. 2002. 0.42
5. 2004. 0.54
5. 2005. 0.83
5. 2006. 0.13
5. 2007. 0.25
5. 2011. 0.98
5. 2012. 0.4
5. 2016. 0.32
5. 2018. 0.15
run;
proc sql;
select unique a.*
, ifn(count(b.YEAR) > 3, mean(b.VAL), .) as AVG
from HAVE a
left join
HAVE b
on a.ID = b.ID
and b.YEAR between a.YEAR-10 and a.YEAR
group by 1, 2
order by 1, 2;
quit;
ID | YEAR | VAL | AVG |
---|---|---|---|
1 | 1990 | 0.24 | . |
2 | 2003 | 0.13 | . |
2 | 2004 | 0.22 | . |
2 | 2005 | 0.26 | . |
2 | 2006 | 0.1 | 0.1775 |
2 | 2007 | 0.95 | 0.332 |
5 | 1998 | 0.2 | . |
5 | 1999 | 0.33 | . |
5 | 2000 | 0.42 | . |
5 | 2001 | 0.01 | 0.24 |
5 | 2002 | 0.42 | 0.276 |
5 | 2004 | 0.54 | 0.32 |
5 | 2005 | 0.83 | 0.392857 |
5 | 2006 | 0.13 | 0.36 |
5 | 2007 | 0.25 | 0.347778 |
5 | 2011 | 0.98 | 0.451429 |
5 | 2012 | 0.4 | 0.507143 |
5 | 2016 | 0.32 | 0.416 |
5 | 2018 | 0.15 | 0.4625 |
Assuming your data has been sorted by id and year.
data HAVE; input ID YEAR VAL; cards; 1. 1990. 0.24 2. 2003. 0.13 2. 2004. 0.22 2. 2005. 0.26 2. 2006. 0.1 2. 2007 0.95 5. 1998. 0.2 5. 1999. 0.33 5. 2000. 0.42 5 2001. 0.01 5. 2002. 0.42 5. 2004. 0.54 5. 2005. 0.83 5. 2006. 0.13 5. 2007. 0.25 5. 2011. 0.98 5. 2012. 0.4 5. 2016. 0.32 5. 2018. 0.15 ; run; data want; if _n_=1 then do; k=.;v=.; declare hash h(); h.definekey('k'); h.definedata('v'); h.definedone(); end; do until(last.id); set have; by id; h.add(key:year,data:val); n=0;sum=0; do k=year-10 to year; if h.find()=0 then do;n+1;sum+v; end; end; if n>3 then m_avg=sum/n; output; end; h.clear(); drop k v n sum; run;
Not sure about speed but you should be able to understand this code and easily modify it.
data HAVE;
infile cards dsd truncover;
input ID YEAR VAL;
cards;
1, 1990, 0.24
2, 2003, 0.13
2, 2004, 0.22
2, 2005, 0.26
2, 2006, 0.1
2, 2007, 0.95
5, 1998, 0.2
5, 1999, 0.33
5, 2000, 0.42
5, 2001, 0.01
5, 2002, 0.42
5, 2004, 0.54
5, 2005, 0.83
5, 2006, 0.13
5, 2007, 0.25
5, 2011, 0.98
5, 2012, 0.4
5, 2016, 0.32
5, 2018, 0.15
;
;
;
run;
*get first and last year for each ID to create skeleton table;
proc sql;
create table years as select ID, min(year) as start_year, max(year) as
end_year from have group by ID order by 1;
quit;
*build out values for each year;
data skeleton;
set years;
do year=start_year to end_year;
output;
end;
run;
proc sort data=have;
by id year;
run;
*merge in main table and calculate moving average at the same time;
data want;
merge skeleton have;
by ID year;
array _val(0:9) _temporary_;
_val{mod(_n_,10)} = val;
if first.ID then
call missing(of _val{*});
if n(of _val(*))>=4 then
avg=mean(of _val(*));
drop start_year end_year;
run;
@KS99 wrote:
Hi, thank you in advance.
I want to calculate moving averages over the preceding 10 years.
If 10 years are not fully supplied, I can calculate based on at least 4 obs. over the preceding 10 years (Gaps in years do not matter).
So, if the below data are given:
ID. YEAR. VALS
1. 1990. 0.24
2. 2003. 0.13
2. 2004. 0.22
2. 2005. 0.26
2. 2006. 0.1
2. 2007 0.95
5. 1998. 0.2
5. 1999. 0.33
5. 2000. 0.42
5 2001. 0.01
5. 2002. 0.42
5. 2004. 0.54
5. 2005. 0.83
5. 2006. 0.13
5. 2007. 0.25
5. 2011. 0.98
5. 2012. 0.4
5. 2016. 0.32
5. 2018. 0.15
My desired output is as follows:
1. 1990. 0.24. .
2. 2003. 0.13. .
2. 2004. 0.22. .
2. 2005. 0.26. .
2. 2006. 0.1. 0.18
2. 2007. 0.95. 0.33
5. 1998. 0.2. .
5. 1999. 0.33. .
5. 2000. 0.42. .
5. 2001 0.01. 0.24
5. 2002. 0.42. 0.28
5. 2004. 0.54. 0.32
5. 2005 0.83. 0.39
5. 2006 0.13. 0.36
5. 2007 0.25. 0.35
5. 2011 0.98. 0.45
5. 2012. 0.4. 0.50
5. 2016 0.32. 0.42
5. 2018 0.15. 0.46
For ID 2 Year 2006, the moving average is over 2003~2006.
For ID 2 Year 2007, the moving average is over 2003~2007.
In the case of ID 5, the values for 2001~2007 are the averages starting from 1998, ending in the year of interest.
The value for 2011 is the average over 2001~2011.
The value for 2016, for example, is the average over 2006~2016.
I was helped by SAS communities for many times in the past. I appreciate your help in advance!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.