Contributor
Posts: 40

# Calculating missing values using data step

Hi all,

My data looks like this.

data one;
infile datalines missover;
input column_1 column_2 column_3 column_4 column_5 column_6 column_7 column_8 column_9 column_10 column_11 column_12 column_13 column_14 column_15 column_16 \$ Start;

datalines;
. . . . . . . . . 52.5 . . . . . 68.10 10
run;

I want to interpolate values for columns column_11 to column_15 (the missing values between column_10 and column_16) based on formula

column_11 = column_10 + (((Column_16 - Column_10)/(16-10))*(11-10));

The Start denotes the column_name from which the interpolation(filling of missing values) should start. It will always have a value like column_16.

Thanks!

Super User
Posts: 13,542

## Re: Calculating missing values using data step

[ Edited ]

Something like:

``````data want;
set one;
array c {11:15} column_11-Column_15;

do i = 11 to 15;
if missing c[i] then c[i] =  column_10 + (((Column_16 - Column_10)/(16-10))*( i -10));
end;
drop i;
run;``````

perhaps.

Posts: 1,337

## Re: Calculating missing values using data step

This program will fill in all missing values, even if there are multiple sequences of missing values.  But it assumes that at least the COLUMN1 and COLUMN16 values are non-misisng.

``````data want (drop=diff ncol c begcol endcol);

set have;
array col{*} column_1-column_16;

begcol=whichn(.,of col{*});

do while (1<begcol<dim(cols));
/*find the last missing value in endcol*/
do endcol=begcol to dim(col)-1 until (col{endcol+1}^=.); end;

diff=col{endcol+1}-col{begcol-1};
ncols=1+endcol-begcol;

do c=1 to ncols;
col{begcol+(c-1)} = col{begcol-1} + diff*(c/ncols);
end;
begcol=whichn(.,of col{*});
end;
run;``````

Super User
Posts: 23,720

## Re: Calculating missing values using data step

Transpose your data and use PROC REG to fill in the missing values instead.

Posts: 1,337

## Re: Calculating missing values using data step

But proc reg will use ALL the non-missing values to make estimates.  I think the OP just wants to use the nearest non-missing value on each end of the missing sequence.

Super User
Posts: 23,720

## Re: Calculating missing values using data step

Maybe. I suspect the exact methodology isn't that specific.

SAS Super FREQ
Posts: 4,241