I am trying to interpolate data that has many missing values using proc expand. My problem is that I do not want to interpolate if two or more years of data are missing. From example below: I want to interpolate by item. Data from 2001-2006 are missing and I do not wish to interpolate values for 2007 from preceding data, because estimates would be inaccurate. But, I want to be able to use the data from 2007 to be able to interpolate values for 2008 on out. So, values in 2007 that are missing will remain missing during the interpolation, but the available values in 2007 (only the december value) can be linked to 2008. Values are annual moving averages computed monthly, so I will have an annual average for 2007 even though I will have missing values for jan-nov. The omit var is an id var indicating that I wish to omit from interpolation because 2+ years are missing before it. See sample code below. Example of data: item year month value Omit a 2000 1 # . ... ... .. ... ... a 2000 12 # . a 2007 1 . 1 *Here is the problem: item is only purchased for these two years but I need moving average to start over here. a 2007 2 . 1 ... ... .. ... ... a 2007 12 # 1 a 2008 1 # . ... ... .. ... ... a 2008 12 # . a 2009 1 # . ... ... .. ... ... a 2009 12 # . proc expand data=old method=spline out=new from=month outtest=test; by itemclss item; id date; where omit~=1; **?????** convert quantity value / observed=total transformin=(log) transformout=(exp); run; When I add the where statement, it removes the obs where omit=1, but still interpolates all values between and including 2000 and 2007. So, it creates outrageous values for months in 2001-2006. I just want to skip those observations and restart interpolation in 2007. Is there any way I can create some sort of where statement, do loop, or macro even to accomplish this? -OR- is there another procedure that computes cubic splines and can handle conditional statements?
... View more