BookmarkSubscribeRSS Feed
pecon1
Fluorite | Level 6

I am trying to interpolate data that has many missing values using proc expand. My problem is that I do not want to interpolate if two or more years of data are missing. From example below: I want to interpolate by item. Data from 2001-2006 are missing and I do not wish to interpolate values for 2007 from preceding data, because estimates would be inaccurate. But, I want to be able to use the data from 2007 to be able to interpolate values for 2008 on out. So, values in 2007 that are missing will remain missing during the interpolation, but the available values in 2007 (only the december value) can be linked to 2008.

Values are annual moving averages computed monthly, so I will have an annual average for 2007 even though I will have missing values for jan-nov. The omit var is an id var indicating that I wish to omit from interpolation because 2+ years are missing before it. See sample code below.

Example of data:

item     year     month     value     Omit

a          2000     1               #          .

...          ...          ..          ...          ...

a          2000     12             #          .
a          2007      1             .           1     *Here is the problem: item is only purchased for these two years but I need moving average to start over here.

a          2007      2             .           1  

...          ...          ..          ...          ...

a          2007      12           #           1

a          2008     1             #            .

...          ...          ..          ...          ...

a          2008     12            #           .

a          2009     1              #           .

...          ...          ..          ...          ...

a          2009     12            #           .

proc expand data=old method=spline

     out=new from=month outtest=test;

     by itemclss item;

     id date;

     where omit~=1; **?????**

     convert quantity value /  observed=total

               transformin=(log)

               transformout=(exp);

run;

When I add the where statement, it removes the obs where omit=1, but still interpolates all values between and including 2000 and 2007. So, it creates outrageous values for months in 2001-2006. I just want to skip those observations and restart interpolation in 2007.

Is there any way I can create some sort of where statement, do loop, or macro even to accomplish this? -OR- is there another procedure that computes cubic splines and can handle conditional statements?

2 REPLIES 2
pecon1
Fluorite | Level 6

I found a solution to this problem. I simply used a count variable in date format to create a mock date variable. If you are having a similar problem and would like specifics, let me know.

dearfisher
SAS Employee

If you do not wish to interpolate the value before 2007 Dec, you can set them missing and then interpolation would begin from 2007 Dec.

You said you found a good solution, would you please share it with me? thanks.

Another question, as you described in  https://communities.sas.com/thread/39579?start=0&tstart=0

If  a lot of continuous missing values occurred in one time series,  using proc expand will product the extreme value which is unreasonable.

In my experience, using mean to fill will not make sense if the series have several high values.

I try to discard the period having too long missing or fill the missing use the value in the same point (e.g., use 2011 Aug to fill 2012 Aug ).but until now, I do not have a good solution for it.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1422 views
  • 0 likes
  • 2 in conversation