DATA Step, Macro, Functions and more

Calculating missing values using data step

Reply
Contributor
Posts: 40

Calculating missing values using data step

Hi all,

My data looks like this. 

 

data one;
infile datalines missover;
input column_1 column_2 column_3 column_4 column_5 column_6 column_7 column_8 column_9 column_10 column_11 column_12 column_13 column_14 column_15 column_16 $ Start;

datalines;
. . . . . . . . . 52.5 . . . . . 68.10 10
run;

 

I want to interpolate values for columns column_11 to column_15 (the missing values between column_10 and column_16) based on formula

column_11 = column_10 + (((Column_16 - Column_10)/(16-10))*(11-10)); 

 

The Start denotes the column_name from which the interpolation(filling of missing values) should start. It will always have a value like column_16. 

 

 

Thanks!

Super User
Posts: 11,343

Re: Calculating missing values using data step

[ Edited ]
Posted in reply to akhilesh_joshi

Something like:

data want;
   set one;
   array c {11:15} column_11-Column_15;

   do i = 11 to 15;
      if missing c[i] then c[i] =  column_10 + (((Column_16 - Column_10)/(16-10))*( i -10)); 
   end;
   drop i;
run;

perhaps.

 

Trusted Advisor
Posts: 1,022

Re: Calculating missing values using data step

Posted in reply to akhilesh_joshi

This program will fill in all missing values, even if there are multiple sequences of missing values.  But it assumes that at least the COLUMN1 and COLUMN16 values are non-misisng.

 

 

 

 

data want (drop=diff ncol c begcol endcol);

  set have;
  array col{*} column_1-column_16;

  begcol=whichn(.,of col{*});

  do while (1<begcol<dim(cols));
    /*find the last missing value in endcol*/
    do endcol=begcol to dim(col)-1 until (col{endcol+1}^=.); end;

    diff=col{endcol+1}-col{begcol-1};
    ncols=1+endcol-begcol;

    do c=1 to ncols;
      col{begcol+(c-1)} = col{begcol-1} + diff*(c/ncols);
    end;
    begcol=whichn(.,of col{*});
  end;
run;

 

Super User
Posts: 19,861

Re: Calculating missing values using data step

Posted in reply to akhilesh_joshi

Transpose your data and use PROC REG to fill in the missing values instead.  

Trusted Advisor
Posts: 1,022

Re: Calculating missing values using data step

But proc reg will use ALL the non-missing values to make estimates.  I think the OP just wants to use the nearest non-missing value on each end of the missing sequence.

Super User
Posts: 19,861

Re: Calculating missing values using data step

Maybe. I suspect the exact methodology isn't that specific. 

SAS Super FREQ
Posts: 3,755

Re: Calculating missing values using data step

Posted in reply to akhilesh_joshi

See the SAS Usage Note about interpolation. For a more mathematical treatment, see the article "Linear interpolation in SAS", which also provides links to other SAS techniques.

Ask a Question
Discussion stats
  • 6 replies
  • 183 views
  • 0 likes
  • 5 in conversation