turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Linear interpolation in SAS

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-17-2016 09:56 AM - last edited on 06-17-2016 10:18 AM by ChrisHemedinger

I have a SAS dataset of business population counts in Canada that was published by Statistics Canada. The data are published semi-annually. I currently have data for each of Dec. 2014 and June 2015 for business population counts by NAICS industry by employee size. However I would like to generate monthly counts by month between Dec. 2014 to June 2015. I have attached a picture of the data set. How would I code this in SAS? Thanks.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-17-2016 10:24 AM

I guess, that PROC EXPAND could solve it

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-17-2016 12:39 PM

Do you want to just roll up the totals to the month level? Try this:

```
proc sql;
create table work.output as
select t1.ref_date,
t1.size,
/* totalcount */
(sum(t1.value)) format=best10. as totalcount
from work.data t1
group by t1.ref_date,
t1.size;
quit;
```

But note that your date value is currently a CHAR -- not a SAS date. If you want to perform any sort of time series analysis you'll want to convert that to a date. You can convert it "on the fly" in SQL with an INPUT function and the ANYDTDTM informat.

```
proc sql;
create table work.output as
select datepart(input(t1.ref_date,anydtdtm7.)) as Month format=yymmd7.,
t1.size,
/* totalcount */
(sum(t1.value)) format=best10. as totalcount
from work.data t1
group by t1.ref_date,
t1.size;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-17-2016 12:45 PM

Thanks for the reply Chris. I want to go to a higher frequency -- from semi-annual to monthly in my dataset. Currently I have observations for Dec. 2014 and June 2015 as shown. So I want to generate values by month (via linear interpolation). How would you modify the code you provided?

Thanks again...Jack

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-17-2016 01:01 PM

Oh, I see -- I misunderstood, thinking you already had monthly. Okay...

You still need to convert your **ref_date** to a date (number) value, so this piece still applies.

datepart(input(t1.ref_date,anydtdtm7.)) as Month format=yymmd7.,

Then as @Klara said, PROC EXPAND can help. It's part of SAS/ETS. If using SAS Enterprise Guide, check out Tasks->Time Series->Prepare Time Series Data. If you don't have SAS/ETS, you might have to look up a data step implementation of a similar interpolation in a SAS Global Forum paper. If you have just two categorical months, interpolation at the month level might be tricky to do with any reliable trending values.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-17-2016 02:44 PM

Thanks Chris....I will give this a try and let you know how it goes...

Jack

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2016 02:36 PM

Thanks for the help Chirs. In SAS EG I tried going the route you mention below but I got a message that informed me that I need SAS EG on my machine in order for this feature to work. I do not have ETS right now. Thanks for the input...Jack

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-18-2016 04:42 AM

If you are using IML . Check Rick's blog: http://blogs.sas.com/content/iml/2012/03/16/linear-interpolation-in-sas.html

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-18-2016 05:27 AM

Best choice is PROC EXPAND if you have ETS. OR try this code: data have; input x; cards; . 12 . . 24 25 . . 2 . ; run; data want; retain y1 y2; do i=1 by 1 until(not missing(x) or last1); set have end=last1; end; y1=x; do j=1 by 1 until(not missing(x) or last2); set have end=last2; if missing(x) then new_x=y2+j*(y1-y2)/i; else new_x=x; output; end; y2=x; drop y1 y2 i j; run; data want; set want; n+1; group=ifc(missing(x),'Interp','Raw'); run; proc sgplot data=want; series x=n y=new_x; scatter x=n y=new_x/group=group markerattrs=(symbol=starfilled); run; run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2016 02:15 PM

Hi Xia, Thanks for the reply. The data I have is semi-annual data (Dec. 2014 and June 2015 periods only) on business population counts by region from Statistics Canada. I want expand the series to show the monthly values between these two months (Jan. 2015, Feb. 2015, etc.). So the code would need to replicate the months between these end dates. Plus I want to include the remaining fields as shown in the screen shot. How can I use your code to generate this expanded data set? Thanks Jack

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-20-2016 11:50 PM

Post data ,Not picture, Noone would like to type it for you .

```
data have;
input date : anydtdte. size $ value;
format date yymms.;
cards;
2014/12 A 124
2014/12 B 70
2015/06 A 140
2015/06 B 80
;
run;
proc sort data=have;by size date;run;
data temp;
merge have have(firstobs=2 keep=date size rename=(date=_date size=_size));
output;
if size=_size then do;
start=date;
do i=1 to intck('month',date,_date)-1;
date=intnx('month',start,i);value=.;output;
end;
end;
drop i _: start;
run;
data want;
retain y1 y2;
do i=1 by 1 until(not missing(value) or last.size);
set temp ;
by size ;
end;
y1=value;
do j=1 by 1 until(not missing(value) or last.size);
set temp ;
by size ;
if missing(value) then new_x=y2+j*(y1-y2)/i;
else new_x=value;
output;
end;
y2=value;
drop y1 y2 i j;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-21-2016 11:48 AM

That's great! Thanks for the example & code.....Jack