DATA Step, Macro, Functions and more

How to derive lagged values (considering different time intervals )?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 11
Accepted Solution

How to derive lagged values (considering different time intervals )?

Dear Sas Community,

I am struggeling with a problem. I want to create lagged values of the variable asset. My dataset consists of observations of different firms and years (see example below).I use:

 

Data Have;

Input FirmID year AssT;

Datalines;

1 1990 100

1 1991 10

1 1994 3

1 1996 5

2 1990 100

2 1991 10

2 1992 3

2 1993 5

;

Run;

Data Lagged1 (drop=i count);

Set Have;

By firmID year;

array x(*) Assets1-Assets5;

Assets1=lag1 (AssT);

Assets2=lag2 (AssT);

Assets3=lag3 (AssT);

Assets4=lag4 (AssT);

Assets5=lag5 (AssT);

if first.firmid then count=1;

do i=count to dim(x);

x(i)=.;

end;

count + 1;

run;

 

The problem is that the time interval between the observations of the same firm is not always exactly one year. So I get false lagged results.

 

Do you have an idea how to alter the code so that it also considers the different time intervals?

 

Many thanks in advance

Best


Accepted Solutions
Solution
‎12-13-2016 10:00 AM
Super User
Posts: 5,516

Re: How to derive lagged values (considering different time intervals )?

It looks like there should be an easy way, but I only see a method with a lot of details.  So here goes.

 

data want;

set have;

by firmid year;

array firms {5} $;

array assets {5};

array yrdifs {5};

array amounts {5} lag1Asst lag2Asst lag3Asswt lag4Asst lag5Asst;

firms1 = lag1(firm); firms2 = lag2(firm); firms3 = lag3(firm); firms4 = lag4(firms); firms5 = lag5(firms);

assets1 = lag1(AssT); assets2 = lag2(AssT); assets3 = lag3(AssT); assets4 = lag4(AssT); assets5 = lag5(AssT);

yrdifs1 = dif1(year); yrdifs2 = dif2(year); yrdifs3 = dif3(year); yrdifs4 = dif4(year); yrdifs5 = dif5(year);

do _n_=1 to 5;

   if firms{_n_} = firm then do;

      if (1 <= yrdifs{_n_} <= 5) then amounts{yrdifs{_n_}} = assets{_n_};

   end;

end;

drop firm1-firm5 assets1-assets5 yrdifs1-yrdifs5;

run;

 

It's untested, and I wish I could think of something simpler.  But at least it ought to work. 

View solution in original post


All Replies
Super User
Posts: 5,516

Re: How to derive lagged values (considering different time intervals )?

You'll have to define that a little more.  "Considers the different time intervals"?  What would you like the results to be for your sample data?

Occasional Contributor
Posts: 11

Re: How to derive lagged values (considering different time intervals )?

Posted in reply to Astounding

Hello astounding,

thanks for the quick reply. This is how I would like the results to be.

 

Data Have;

Input FirmID year AssT Lag1Asst Lag2Asst Lag3Asst;

Datalines;

1 1990 100 . . .

1 1991 10 100 . .

1 1994 3 . . 10

1 1996 5 . 3 . .

2 1990 100 . . .

2 1991 10 100 . .

2 1992 3 10 100 .

2 1993 5 3 10 100

;

Run;

 

Best

Anja

Super User
Posts: 11,343

Re: How to derive lagged values (considering different time intervals )?

I suspect that you may need to create lags of the Firmid and then something like:

 

If firmid = LagFirmid1 then <do something> ;

but it isn't quite clear.

You may also want temporary variables to hold the lag of asst and conditionally assign x[i] to the temporary variable. The drop the LagFirm and the asset temporaries.

 

When you have a number of variables to set to missing you may want to consider call missing:

 

Call missing (of x(*)); in the case of an entirer array or

Call missing (x,y, z, p, d, q);

Nice thing about call missing is you can mix variable types unlike most operations.

 

I might guess you are attempting some like this:

Data Lagged1 (drop=i );
   Set Have;
   By firmID year;
   array x(*) Assets1-Assets5;
   LAsst1=lag1 (AssT);
   LAsst2=lag2 (AssT);
   LAsst3=lag3 (AssT);
   LAsst4=lag4 (AssT);
   LAsst5=lag5 (AssT);
   LFirmId1=lag1 (FirmId);
   LFirmId2=lag2 (FirmId);
   LFirmId3=lag3 (FirmId);
   LFirmId4=lag4 (FirmId);
   LFirmId5=lag5 (FirmId);
   array LA Lasst1-LAsst5;
   array f LFirmId1-LFirmId5;
   call missing (of x(*));
   do i=1 to dim(x);
      if f[i]=firmid then x[i]= LA[i];
   end;
   drop Lf: LA:;
run;
Solution
‎12-13-2016 10:00 AM
Super User
Posts: 5,516

Re: How to derive lagged values (considering different time intervals )?

It looks like there should be an easy way, but I only see a method with a lot of details.  So here goes.

 

data want;

set have;

by firmid year;

array firms {5} $;

array assets {5};

array yrdifs {5};

array amounts {5} lag1Asst lag2Asst lag3Asswt lag4Asst lag5Asst;

firms1 = lag1(firm); firms2 = lag2(firm); firms3 = lag3(firm); firms4 = lag4(firms); firms5 = lag5(firms);

assets1 = lag1(AssT); assets2 = lag2(AssT); assets3 = lag3(AssT); assets4 = lag4(AssT); assets5 = lag5(AssT);

yrdifs1 = dif1(year); yrdifs2 = dif2(year); yrdifs3 = dif3(year); yrdifs4 = dif4(year); yrdifs5 = dif5(year);

do _n_=1 to 5;

   if firms{_n_} = firm then do;

      if (1 <= yrdifs{_n_} <= 5) then amounts{yrdifs{_n_}} = assets{_n_};

   end;

end;

drop firm1-firm5 assets1-assets5 yrdifs1-yrdifs5;

run;

 

It's untested, and I wish I could think of something simpler.  But at least it ought to work. 

Occasional Contributor
Posts: 11

Re: How to derive lagged values (considering different time intervals )?

Posted in reply to Astounding

Great, thank you very much. Works perfectly!!!!

 

Is it possible to modify your code so that it produces look-ahead values instead of lagged values?

Super User
Posts: 5,516

Re: How to derive lagged values (considering different time intervals )?

Not really.  SAS has decent tools for looking backward and a much more limited set of tools for looking forward through the data.  It is posible to program that, but the program wouldn't resemble this program.

Trusted Advisor
Posts: 1,022

Re: How to derive lagged values (considering different time intervals )?

You later asked about getting leads, for which there is no explicit SAS functions.  However, you can take advantage of the FIRSTOBS, KEEP, and RENAME parameters for data set names.

 

The program below assumes your data are sorted by FIRMID/YEAR.  You want leads up to 6 years in advance, but there may be "holes" in the years subsequent to the current record (i.e. current year may be 1998, but it is followed by 2000,2001,2003,2004 - missing are 1999 and 2002):

 

Notes:

  1. MERGE here purposely does NOT have a BY statement, and will not work correctly if it does.
  2. Each of the FIRSTOBS= datasets has an explicit list of KEEP variables and a corresponding RENAME list.

 



data want;
  merge have
        have (firstobs=2 keep=firmid year assets rename=(firmid=fid1 assets=ast1 year=yr1))
        have (firstobs=3 keep=firmid year assets rename=(firmid=fid2 assets=ast2 year=yr2))
        have (firstobs=4 keep=firmid year assets rename=(firmid=fid3 assets=ast3 year=yr3))
        have (firstobs=5 keep=firmid year assets rename=(firmid=fid4 assets=ast4 year=yr4))
        have (firstobs=6 keep=firmid year assets rename=(firmid=fid5 assets=ast5 year=yr5))
        have (firstobs=7 keep=firmid year assets rename=(firmid=fid6 assets=ast6 year=yr6))
        ;

  array ldassets {6};  /* lead values of assets, for year+1 through year+6 */

  array yr {6};
  array ast {6};
  array fid{6};

  do I=1 to 6 while (  fid{I}=firmid  and  (yr{I}-year)<= 6 );
    ldassets{yr{I}-year}=ast{I};
  end;
  drop I yr: ast: fid: ;

run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 239 views
  • 0 likes
  • 4 in conversation