Forecasting using SAS Forecast Server, SAS/ETS, and more

Lagged variables for DiD analysis

Reply
Occasional Contributor
Posts: 5

Lagged variables for DiD analysis

Hello,

I have data in the following style:

GroupDateNumberObservationsNumb. Obs. Last Year
M1XM01/2008

2

.
M1XM01/2009221

2

M1XM02/20094522
M1XM03/20091322
M1XM05/2009452
M1XM01/20102221
M3GL01/20084.
M3GL01/2009154
M3GL02/2009454
M3GL03/2009134
M3GL05/2009124
M3GL08/200912654
M3GL10/20099854
M3GL12/20094514
M3GL01/201035215
M3GL02/20104145
M3GL03/2010713
M3GL04/201045.
M3GL05/20106212

Additional Informations: I need the data "Numb. Obs. Last Year". The data of the year 2008 does not contain a month, nevertheless are the data formated as date (mm/yyyy). For 2008 there are always 01/2008. I need the lagged observations from the month in the previous year. For 2009 I want to use for every month 01/2008. The Dataset is very large.

I tried it with lag function LAG<n>(argument), but my n differs. I'm able to calculate for each observation the n with a macro, but then I fail.

It would be great if anybody can provide me a solution or an approach of solving the problem.

Thank you very much in advance!

Emanuel

Regular Contributor
Posts: 244

Re: Lagged variables for DiD analysis

SQL Join is probably your best bet - you could do this in a data step merge also, if you set things up ahead of time with a variable for 'desired lag date'.

data have;

input group $ date :anydtdte. numobs desiredlag;

datalines;

M1XM 01/2008 2 .

M1XM 01/2009 221 2

M1XM 02/2009 452 2

M1XM 03/2009 132 2

M1XM 05/2009 45 2

M1XM 01/2010 2 221

M3GL 01/2008 4 .

M3GL 01/2009 15 4

M3GL 02/2009 45 4

M3GL 03/2009 13 4

M3GL 05/2009 12 4

M3GL 08/2009 1265 4

M3GL 10/2009 985 4

M3GL 12/2009 451 4

M3GL 01/2010 352 15

M3GL 02/2010 41 45

M3GL 03/2010 7 13

M3GL 04/2010 45 .

M3GL 05/2010 62 12

;;;;

run;

proc sql;

create table want as

  select H.*, L.numobs as lyobs from

  have H

  left join

  have L

  on (H.group=L.group) and ((H.date = intnx('YEAR',l.date,1,'s'))

  or (year(H.date)=2009 and year(L.date)=2008))

  order by H.group, H.date;

quit;

SAS Employee
Posts: 1

Re: Lagged variables for DiD analysis

I would do this using PROC TIMEDATA as follows. This will also add in the months that are missing.

If you don't wish to have those months listed, then you can simply say

outarray=outarray(where=(numobs)); thus only printing output where numobs is nonzero and non-missing....

The reason i suggest this over sql is that you are going to have more flexibility in your timeseries analysis in general,

proc timedata data=have outarray=outarray;

by group;

id date interval=month format=date9.;

var numobs;

outarray lagobs;

   do t=1 to dim(numobs);

        if year(date)=2008 then do;

           lagobs=.;

        end;

        if year(date)=2009 then do;

           count=intck('month','01JAN2008'd,date);

           lagobs=numobs[t-count];

        end;

       if year(date)>2009 then do;

      previousyear=intnx('year',date,-1,'s');

      count=intck('month',previousyear,date);

       lagobs=numobs[t-count];

      end;

   end;

run;

proc print data=outarray;run;

Occasional Contributor
Posts: 5

Re: Lagged variables for DiD analysis

I tried it, but I haven't the PROC TIMEDATA. I have SAS Enterprise Guide 4.3.

Occasional Contributor
Posts: 5

Re: Lagged variables for DiD analysis

Hi all,

I solved the problem by using proc panel.

Here creating lags is very easy.

Kind regards,

Emanuel

Ask a Question
Discussion stats
  • 4 replies
  • 706 views
  • 0 likes
  • 3 in conversation