Solved
Contributor
Posts: 63

# Fill missing value with the average of surrounding values

[ Edited ]

Hi, I currently have a problem that requires filling the missing value with surrounding value mean. For example:

data have;
input a ;
datalines;
1
2
.
.
.
2
;
run;

Here is what I need:

data want;
input a ;
datalines;
1
2
1.5
1.25
1.5
2
;
run;
As you may notice, the filling method is when there is only one missing, simply take the average of the surrounding values. If there are two continuous missing value, take the average of surrounding values to fill the first missing, then use the same method as only one missing value to fill the second missing.
Any suggestion would be appreciated. Thank you.

Accepted Solutions
Solution
‎03-31-2018 10:50 AM
Posts: 5,474

## Re: Fill missing value with the average of surrounding values

What is the advantage of such an interpolation method? Its simplicity maybe?

``````data have;
input a;
datalines;
1
2
2
.
.
1
.
2
;

data want;
retain _a;
do nb = 1 by 1 until(not missing(a_));
set have(rename=a=a_);
end;
do i = 1 to nb;
set have;
if missing(a) then a = (_a + a_) / 2;
_a = a;
output;
end;
drop nb i _a a_;
run;

proc print; run;
``````
PG

All Replies
Valued Guide
Posts: 558

## Re: Fill missing value with the average of surrounding values

PROC EXPAND can be used for interpolating the moving average. But here depending on your condition you need to make some looping to make sure that there is only one missing value between consecutive records to get the output that you desired.

``````data have;
input a ;
obs=_n_;
datalines;
1
2
2
.
.
1
.
2
;
run;
data count_max_mis;
set have ;
if not missing(a) then count=0;
else count+1;
run;

proc sql;
select max(count) into: n_miss
from count_max_mis;
quit;

%macro interpolate_miss();
%DO i=1 %to &n_miss;

data Inter;
set have;
if not missing(a) then call symput("value","NOTNULL");
else if missing(a) and SYMGET("value")="NOTNULL" then call symput("value","NULL");
else if missing(a) then delete;
run;

PROC EXPAND data=Inter out=Updated_miss(drop=time) method=none;
convert a / transformout=(missonly  cmovave 3 );
run;

data Have;
update have Updated_miss;
by obs;
run;
%end;
%MEND interpolate_miss;
%interpolate_miss;``````
Thanks,
Suryakiran
Contributor
Posts: 63

## Re: Fill missing value with the average of surrounding values

Solution
‎03-31-2018 10:50 AM
Posts: 5,474

## Re: Fill missing value with the average of surrounding values

What is the advantage of such an interpolation method? Its simplicity maybe?

``````data have;
input a;
datalines;
1
2
2
.
.
1
.
2
;

data want;
retain _a;
do nb = 1 by 1 until(not missing(a_));
set have(rename=a=a_);
end;
do i = 1 to nb;
set have;
if missing(a) then a = (_a + a_) / 2;
_a = a;
output;
end;
drop nb i _a a_;
run;

proc print; run;
``````
PG
Contributor
Posts: 63

## Re: Fill missing value with the average of surrounding values

Thank you for the reply. It works well.
Super User
Posts: 10,681

## Re: Fill missing value with the average of surrounding values

``````data have;
input a ;
datalines;
1
2
2
.
.
1
.
2
;
run;

data want;
set have end=last;
array x{1000} _temporary_;
x{_n_}=a;
if last then do;
do i=1 to _n_;
if missing(x{i}) then do;
prev=x{i-1};next=.;
do j=i+1 to _n_;
if not missing(x{j}) then do;next=x{j};leave;end;
end;
x{i}=(prev+next)/2;
end;
end;

do k=1 to _n_;
a=x{k};output;
end;
end;
keep a;
run;``````
☑ This topic is solved.