Hello,
I have data in the following style:
Group | Date | NumberObservations | Numb. Obs. Last Year |
---|---|---|---|
M1XM | 01/2008 | 2 | . |
M1XM | 01/2009 | 221 | 2 |
M1XM | 02/2009 | 452 | 2 |
M1XM | 03/2009 | 132 | 2 |
M1XM | 05/2009 | 45 | 2 |
M1XM | 01/2010 | 2 | 221 |
M3GL | 01/2008 | 4 | . |
M3GL | 01/2009 | 15 | 4 |
M3GL | 02/2009 | 45 | 4 |
M3GL | 03/2009 | 13 | 4 |
M3GL | 05/2009 | 12 | 4 |
M3GL | 08/2009 | 1265 | 4 |
M3GL | 10/2009 | 985 | 4 |
M3GL | 12/2009 | 451 | 4 |
M3GL | 01/2010 | 352 | 15 |
M3GL | 02/2010 | 41 | 45 |
M3GL | 03/2010 | 7 | 13 |
M3GL | 04/2010 | 45 | . |
M3GL | 05/2010 | 62 | 12 |
Additional Informations: I need the data "Numb. Obs. Last Year". The data of the year 2008 does not contain a month, nevertheless are the data formated as date (mm/yyyy). For 2008 there are always 01/2008. I need the lagged observations from the month in the previous year. For 2009 I want to use for every month 01/2008. The Dataset is very large.
I tried it with lag function LAG<n>(argument), but my n differs. I'm able to calculate for each observation the n with a macro, but then I fail.
It would be great if anybody can provide me a solution or an approach of solving the problem.
Thank you very much in advance!
Emanuel
SQL Join is probably your best bet - you could do this in a data step merge also, if you set things up ahead of time with a variable for 'desired lag date'.
data have;
input group $ date :anydtdte. numobs desiredlag;
datalines;
M1XM 01/2008 2 .
M1XM 01/2009 221 2
M1XM 02/2009 452 2
M1XM 03/2009 132 2
M1XM 05/2009 45 2
M1XM 01/2010 2 221
M3GL 01/2008 4 .
M3GL 01/2009 15 4
M3GL 02/2009 45 4
M3GL 03/2009 13 4
M3GL 05/2009 12 4
M3GL 08/2009 1265 4
M3GL 10/2009 985 4
M3GL 12/2009 451 4
M3GL 01/2010 352 15
M3GL 02/2010 41 45
M3GL 03/2010 7 13
M3GL 04/2010 45 .
M3GL 05/2010 62 12
;;;;
run;
proc sql;
create table want as
select H.*, L.numobs as lyobs from
have H
left join
have L
on (H.group=L.group) and ((H.date = intnx('YEAR',l.date,1,'s'))
or (year(H.date)=2009 and year(L.date)=2008))
order by H.group, H.date;
quit;
I would do this using PROC TIMEDATA as follows. This will also add in the months that are missing.
If you don't wish to have those months listed, then you can simply say
outarray=outarray(where=(numobs)); thus only printing output where numobs is nonzero and non-missing....
The reason i suggest this over sql is that you are going to have more flexibility in your timeseries analysis in general,
proc timedata data=have outarray=outarray;
by group;
id date interval=month format=date9.;
var numobs;
outarray lagobs;
do t=1 to dim(numobs);
if year(date
lagobs
end;
if year(date
count=intck('month','01JAN2008'd,date
lagobs
end;
if year(date
previousyear=intnx('year',date
count=intck('month',previousyear,date
lagobs
end;
end;
run;
proc print data=outarray;run;
I tried it, but I haven't the PROC TIMEDATA. I have SAS Enterprise Guide 4.3.
Hi all,
I solved the problem by using proc panel.
Here creating lags is very easy.
Kind regards,
Emanuel
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.