Hello, I'd like some help filling in missing values. I have a dataset of daily temperature values, some of which are missing, and I want to fill them in only when certain conditions are met:

- If there are non-missing values within 7 days before and 7 days after the missing day, use the average of the closest previous and closest next non-missing values as the imputed value.
- If the closest non-missing value is outside this 7-day window (before or after), keep the value as missing.
- If the first or last day of the data collected is missing, keep it as missing.

A few caveats I can think of:

- Filling in these values must be done BY zipcode.
- Sometimes there are several consecutive missing values.
- This is a big dataset, so the more efficient, the better.

The way I am currently (not successfully) going about it is: get the nearest before value (var1), get the nearest after value (var2), and average these. However, I don't know how to account for the 7-day condition.

Thank you!

data have;
input date $ zip temp; /* date read as character because jan1 etc. are placeholder labels */
datalines;
jan1 90001 50
jan2 90001 51
jan3 90001 53
jan4 90001 .
jan5 90001 49
jan6 90001 .
jan7 90001 .
jan8 90001 .
jan9 90001 50
jan10 90001 55
;
run;
data want;
input date $ zip temp temp_new;
datalines;
jan1 90001 50 50
jan2 90001 51 51
jan3 90001 53 53
jan4 90001 . 51
jan5 90001 49 49
jan6 90001 . 49.5
jan7 90001 . 49.5
jan8 90001 . 49.5
jan9 90001 50 50
jan10 90001 55 55
;
run;
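
One possible way to handle the 7-day condition is sketched below. It is only a sketch under some assumptions: `date` is stored as a real numeric SAS date (the sample uses placeholder labels such as jan1), "within 7 days" is read as a gap of at most 7 days on each side, and the dataset names `fwd` and `want` and the helper variables `prev_temp`, `prev_date`, `next_temp`, `next_date` are made up for illustration. The idea is to carry the nearest earlier non-missing value and its date forward within each zip, do the same in reverse date order for the nearest later value, and average the two only when both date gaps fit the window.

```sas
/* Assumption: HAVE contains zip, temp, and a numeric SAS date in DATE. */
proc sort data=have;
  by zip date;
run;

/* Pass 1: within each zip, remember the latest non-missing temp and its date. */
data fwd;
  set have;
  by zip;
  retain prev_temp prev_date;
  if first.zip then call missing(prev_temp, prev_date);
  if not missing(temp) then do;
    prev_temp = temp;
    prev_date = date;
  end;
run;

/* Pass 2: reverse the date order so the same logic finds the next value. */
proc sort data=fwd;
  by zip descending date;
run;

data want;
  set fwd;
  by zip;
  retain next_temp next_date;
  if first.zip then call missing(next_temp, next_date);
  if not missing(temp) then do;
    next_temp = temp;
    next_date = date;
  end;

  temp_new = temp;
  /* Impute only when both neighbours exist and each is at most 7 days away. */
  if missing(temp) and nmiss(prev_date, next_date) = 0 then do;
    if date - prev_date <= 7 and next_date - date <= 7 then
      temp_new = (prev_temp + next_temp) / 2;
  end;

  drop prev_temp prev_date next_temp next_date;
run;

/* Restore chronological order. */
proc sort data=want;
  by zip date;
run;
```

A missing first or last day in a zip never has both neighbours, so it stays missing. With real dates in place of the labels, the sample should give 51 for jan4 and 49.5 for jan6 through jan8. The extra sorts cost time on a big dataset, so this favours simplicity over speed.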