Is it possible to interpolate dates using start and end dates ?
Also is it possible to interpolate missing dates based on some other variable in the dataset?
Longitude | Dates | Quantity |
24.5 | 24May2018 | 340000 |
25.5 | missing | 340000 |
26 | missing | 320000 |
27 | missing | 300000 |
28 | missing | 280000 |
28.5 | missing | 240000 |
29 | missing | 210000 |
30 | missing | 180000 |
30.3 | 31Dec2018 | 175000 |
Because SAS can stores date values as numbers (number of days after 01jan1960, or negative values for prior to 01jan1960), the answer is yes, you can interpolate dates.
I presume you want to take, for each longitude, it's relative position between longitudes with know dates, and apply that same relative progress to the time span between the known dates, as in:
proc format ; invalue indate 'missing'=. other=[date9.]; run; data have; input Longitude Date :indate9. Quantity; format date date9. ; datalines; 24.5 24May2018 340000 25.5 missing 340000 26 missing 320000 27 missing 300000 28 missing 280000 28.5 missing 240000 29 missing 210000 30 missing 180000 30.3 31Dec2018 175000 run; data want (drop=_:); do _n=1 by 1 until (date^=.); set have; end; _startdate=lag(date); _ndays=dif(date); /* dif(date) means date-lag(date) */ _startlong=lag(longitude); _difflong=dif(longitude); do _i=1 to _n ; set have; if _difflong^=. and date=. then date = _startdate + _ndays*(longitude-_startlong)/_difflong; output; end; run;
But please be aware that the interpolated date values above need not be integers, even though displaying them with the date9. format will look the same as if they were integers. For instance the underlying date value for 24May2018 is 21328. But if you had a value of, say 21328.99, and you applied the date9. format to its display, it would still appear as 24MAY2018.
You could make all the interpolated values into integers, if you wish. But if you do so, I'd suggest the FLOOR function (instead of round, or ceiling), just to be consistent with the date9. format.
Once upon a time I had a project where I had to implement interpolation of times by almost arbitrary intervals: 5, 10, 15, and 30 minutes, 1, 2, 4, 6, and 12 hours. In ANSI FORTRAN, which had at the time a very limited time sense. I prototyped the project in SAS because the time functions were robust. Then I could check my FORTRAN code results against something.
You should specify what rule for the intervals should be.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.