BookmarkSubscribeRSS Feed
akhilesh_joshi
Fluorite | Level 6

Hi all,

My data looks like this. 

 

data one;
infile datalines missover;
input column_1 column_2 column_3 column_4 column_5 column_6 column_7 column_8 column_9 column_10 column_11 column_12 column_13 column_14 column_15 column_16 $ Start;

datalines;
. . . . . . . . . 52.5 . . . . . 68.10 10
run;

 

I want to interpolate values for columns column_11 to column_15 (the missing values between column_10 and column_16) based on formula

column_11 = column_10 + (((Column_16 - Column_10)/(16-10))*(11-10)); 

 

The Start denotes the column_name from which the interpolation(filling of missing values) should start. It will always have a value like column_16. 

 

 

Thanks!

6 REPLIES 6
ballardw
Super User

Something like:

data want;
   set one;
   array c {11:15} column_11-Column_15;

   do i = 11 to 15;
      if missing c[i] then c[i] =  column_10 + (((Column_16 - Column_10)/(16-10))*( i -10)); 
   end;
   drop i;
run;

perhaps.

 

mkeintz
PROC Star

This program will fill in all missing values, even if there are multiple sequences of missing values.  But it assumes that at least the COLUMN1 and COLUMN16 values are non-misisng.

 

 

 

 

data want (drop=diff ncol c begcol endcol);

  set have;
  array col{*} column_1-column_16;

  begcol=whichn(.,of col{*});

  do while (1<begcol<dim(cols));
    /*find the last missing value in endcol*/
    do endcol=begcol to dim(col)-1 until (col{endcol+1}^=.); end;

    diff=col{endcol+1}-col{begcol-1};
    ncols=1+endcol-begcol;

    do c=1 to ncols;
      col{begcol+(c-1)} = col{begcol-1} + diff*(c/ncols);
    end;
    begcol=whichn(.,of col{*});
  end;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User

Transpose your data and use PROC REG to fill in the missing values instead.  

mkeintz
PROC Star

But proc reg will use ALL the non-missing values to make estimates.  I think the OP just wants to use the nearest non-missing value on each end of the missing sequence.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User

Maybe. I suspect the exact methodology isn't that specific. 

Rick_SAS
SAS Super FREQ

See the SAS Usage Note about interpolation. For a more mathematical treatment, see the article "Linear interpolation in SAS", which also provides links to other SAS techniques.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2289 views
  • 0 likes
  • 5 in conversation