filling in missing values via interpolation

Reply
New Contributor
Posts: 3

filling in missing values via interpolation

Dear,

I have a big file with records sorted by date. The variables can have missing data.

I would like to fill in misiing data by interpolation, but the criterion is that the no. of

records with missing data should not be bigger than 6-7 records.

I know I can use PROC EXPAND, but how can I control the no. of records.

Thanks

Trusted Advisor
Posts: 1,022

Re: filling in missing values via interpolation

"the no. of records with missing data should not be bigger than 6-7 records.".  Do you mean 6-7 CONSECUTIVE records, or 6-7 missing records overall?

 

And are you saying that you want to leave such records as missing values, or do you want to drop the entire series (or maybe possibly entire series of the relevant BY group, if you have by groups)?

 

And if there were only 1 missing data, what sort of interpolation do you want?

 

regards,

Mark

Trusted Advisor
Posts: 1,022

Re: filling in missing values via interpolation

Say you have dataset HAVE with variable X.   You want to create X1 as follows:

    If X is non-missing then X1=X

    if X is part of a series of 7 or more missing values, then X1=.

    if X is otherwise missing then X1= a backward moving average of the last 10 records (MOVAVE 10)

 

I don't see how you can do that in a single proc expand,  but you could use PROC EXPAND to create dataset NEED, in which X1 is a trailing moving average for all missing values.  Then read in dataset NEED, and convert X1 to missing for all cases of 7 or more consecutive X's:

 

Regards,

Mark

 

proc expand data=have out=need method=none;
    convert x=x1 / transformout=(missonly movave 10);
run;

data want;
  /* count a sequence of records having a single X value*/
  do n=1 by 1 until (last.x);
    set need;
    by x notsorted;
  end;

  /* If the count>=7 and the X value is missing set a flag */
  if n>=7 and x=. then flag=1;

  /* Reread the same records, and if flag=1 set X1 to missing*/
  do until (last.x);
    set need;
    by x notsorted;
    if flag=1 then x1=.;
    output;
  end;
  drop flag n;
run;
Respected Advisor
Posts: 4,931

Re: filling in missing values via interpolation

Can you elaborate on your criterion. It is not clear what the criterion applies to. Providing a simple example would make things clearer.

PG
New Contributor
Posts: 3

filling in missing values via interpolation

Dear,

I have a big file with records sorted by date. The variables can have missing data.

I would like to fill in missing data by interpolation, but the criterion is that the no. of

records with missing data should not be bigger than 6-7 records.

I know I can use PROC EXPAND, but how can I control the no. of records.

Thanks

Super User
Posts: 10,044

Re: filling in missing values via interpolation

Make a group variable 

data have;
 set sashelp.air;
 if ranuni(0) lt 0.8 then call missing(air);
run;

data temp;
 set have;
 if not missing(air) then group+1;
run;


then PROC EXPEND

then DOW or Hash Table to call missing it for groups which's count greater than 6 


Ask a Question
Discussion stats
  • 5 replies
  • 295 views
  • 0 likes
  • 4 in conversation