06-17-2016 09:56 AM - last edited on 06-17-2016 10:18 AM by ChrisHemedinger
I have a SAS dataset of business population counts in Canada that was published by Statistics Canada. The data are published semi-annually. I currently have data for each of Dec. 2014 and June 2015 for business population counts by NAICS industry by employee size. However I would like to generate monthly counts by month between Dec. 2014 to June 2015. I have attached a picture of the data set. How would I code this in SAS? Thanks.
06-17-2016 12:39 PM
Do you want to just roll up the totals to the month level? Try this:
proc sql; create table work.output as select t1.ref_date, t1.size, /* totalcount */ (sum(t1.value)) format=best10. as totalcount from work.data t1 group by t1.ref_date, t1.size; quit;
But note that your date value is currently a CHAR -- not a SAS date. If you want to perform any sort of time series analysis you'll want to convert that to a date. You can convert it "on the fly" in SQL with an INPUT function and the ANYDTDTM informat.
proc sql; create table work.output as select datepart(input(t1.ref_date,anydtdtm7.)) as Month format=yymmd7., t1.size, /* totalcount */ (sum(t1.value)) format=best10. as totalcount from work.data t1 group by t1.ref_date, t1.size; quit;
06-17-2016 12:45 PM
Thanks for the reply Chris. I want to go to a higher frequency -- from semi-annual to monthly in my dataset. Currently I have observations for Dec. 2014 and June 2015 as shown. So I want to generate values by month (via linear interpolation). How would you modify the code you provided?
06-17-2016 01:01 PM
Oh, I see -- I misunderstood, thinking you already had monthly. Okay...
You still need to convert your ref_date to a date (number) value, so this piece still applies.
datepart(input(t1.ref_date,anydtdtm7.)) as Month format=yymmd7.,
Then as @Klara said, PROC EXPAND can help. It's part of SAS/ETS. If using SAS Enterprise Guide, check out Tasks->Time Series->Prepare Time Series Data. If you don't have SAS/ETS, you might have to look up a data step implementation of a similar interpolation in a SAS Global Forum paper. If you have just two categorical months, interpolation at the month level might be tricky to do with any reliable trending values.
06-20-2016 02:36 PM
Thanks for the help Chirs. In SAS EG I tried going the route you mention below but I got a message that informed me that I need SAS EG on my machine in order for this feature to work. I do not have ETS right now. Thanks for the input...Jack
06-18-2016 05:27 AM
Best choice is PROC EXPAND if you have ETS. OR try this code: data have; input x; cards; . 12 . . 24 25 . . 2 . ; run; data want; retain y1 y2; do i=1 by 1 until(not missing(x) or last1); set have end=last1; end; y1=x; do j=1 by 1 until(not missing(x) or last2); set have end=last2; if missing(x) then new_x=y2+j*(y1-y2)/i; else new_x=x; output; end; y2=x; drop y1 y2 i j; run; data want; set want; n+1; group=ifc(missing(x),'Interp','Raw'); run; proc sgplot data=want; series x=n y=new_x; scatter x=n y=new_x/group=group markerattrs=(symbol=starfilled); run; run;
06-20-2016 02:15 PM
06-20-2016 11:50 PM
Post data ,Not picture, Noone would like to type it for you .
data have; input date : anydtdte. size $ value; format date yymms.; cards; 2014/12 A 124 2014/12 B 70 2015/06 A 140 2015/06 B 80 ; run; proc sort data=have;by size date;run; data temp; merge have have(firstobs=2 keep=date size rename=(date=_date size=_size)); output; if size=_size then do; start=date; do i=1 to intck('month',date,_date)-1; date=intnx('month',start,i);value=.;output; end; end; drop i _: start; run; data want; retain y1 y2; do i=1 by 1 until(not missing(value) or last.size); set temp ; by size ; end; y1=value; do j=1 by 1 until(not missing(value) or last.size); set temp ; by size ; if missing(value) then new_x=y2+j*(y1-y2)/i; else new_x=value; output; end; y2=value; drop y1 y2 i j; run;