If a variable is missing for a given year (say var{2016}), then mean(var{2015},var{2017}) returns one of the following:
1. The actual mean, if both var{2015} and var{2017} are valid values, or
2. The single valid value, if only one of var{2015} or var{2017} is valid, or
3. Otherwise, a missing value.
If you hit case 3 (both neighbors missing), then take the mean of var{2014} and var{2018} instead, i.e. expand the range by one year in each direction, and keep expanding until at least one valid value is found.
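As a quick illustration of those three cases, here is a minimal sketch using literal values rather than the real data. The variable names both, one and none are just placeholders for this example. The n() function counts non-missing arguments, which is why the code further down tests it before calling mean().

```sas
data _null_;
  both = mean(0.2, 0.5);  /* case 1: both neighbors valid, gives 0.35    */
  one  = mean(., 0.6);    /* case 2: one neighbor valid, gives 0.6       */
  none = mean(., .);      /* case 3: nothing valid, result is missing    */
  cnt  = n(., 0.6);       /* number of non-missing arguments, here 1     */
  put both= one= none= cnt=;
run;
```

Calling mean() with only missing arguments usually also produces a "missing values were generated" note in the log, so checking n(...)>0 first keeps the log clean.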
Using that principle, this code could work:
%let min_year=2014;
%let max_year=2019;
%let span=%eval(&max_year-&min_year);
%let array_lower_bound=%eval(&min_year-&span);
%let array_upper_bound=%eval(&max_year+&span);
%put _user_;
data have;
infile cards truncover;
input Year State a b c;
date=mdy(1, 1, year);
format date year4.;
cards;
2014 1 0.2 . 0.6
2015 1 . . .
2016 1 0.5 0.6 .
2017 1 0.5 0.6 0.3
2018 1 0.6 . .
2019 1 0.3 . 0.7
2014 2 . . 2
2015 2
2016 2
2017 2 . 0.4 .
2018 2 . 0.5 .
2019 2
2014 3
2015 3
2016 3
2017 3
2018 3
2019 3 0.9 0.8 0.7
run;
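data want (drop=v span);
  /* Read each state twice: pass 1 loads the history, pass 2 fills the gaps. */
  /* HAVE must be sorted by state for the BY statement to work.              */
  set have (in=firstpass) have (in=secondpass);
  by state;
  array suspect_var {*} a b c;
  array ah{&array_lower_bound:&array_upper_bound,3} _temporary_; /* Actual history */
  if first.state then call missing(of ah{*});
  if firstpass then do v=1 to dim(suspect_var); /* Build the actual history array */
    ah{year,v}=suspect_var{v};
  end;
  if secondpass; /* Only second-pass rows are processed further and output */
  do v=1 to dim(suspect_var);
    /* Widen the window by one year per side until a valid neighbor is found */
    do span=0 to &span while (suspect_var{v}=.);
      if n(ah{year-span,v},ah{year+span,v})>0 then suspect_var{v}=mean(ah{year-span,v},ah{year+span,v});
    end;
  end;
run;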
Notice that the array's lower and upper bounds go beyond the actual earliest and latest years. All those pre-study and post-study years will simply have missing values for each variable. This allows a simple expansion of the span used to generate means without ever indexing outside the array bounds.
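To make the padding concrete, this small sketch repeats the macro arithmetic from the code above and prints the resulting bounds. With min_year=2014 and max_year=2019 the span is 5, so the array runs from 2009 to 2024.

```sas
%let min_year=2014;
%let max_year=2019;
%let span=%eval(&max_year-&min_year);   /* 2019-2014 = 5 */
/* Writes lower=2009 upper=2024 to the log */
%put lower=%eval(&min_year-&span) upper=%eval(&max_year+&span);
```

Even for year 2014 at the widest span, the lookup year-span is 2009, which is still a valid (always missing) array element.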