BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AnjaV
Calcite | Level 5

Dear Sas Community,

I am struggeling with a problem. I want to create lagged values of the variable asset. My dataset consists of observations of different firms and years (see example below).I use:

 

Data Have;

Input FirmID year AssT;

Datalines;

1 1990 100

1 1991 10

1 1994 3

1 1996 5

2 1990 100

2 1991 10

2 1992 3

2 1993 5

;

Run;

Data Lagged1 (drop=i count);

Set Have;

By firmID year;

array x(*) Assets1-Assets5;

Assets1=lag1 (AssT);

Assets2=lag2 (AssT);

Assets3=lag3 (AssT);

Assets4=lag4 (AssT);

Assets5=lag5 (AssT);

if first.firmid then count=1;

do i=count to dim(x);

x(i)=.;

end;

count + 1;

run;

 

The problem is that the time interval between the observations of the same firm is not always exactly one year. So I get false lagged results.

 

Do you have an idea how to alter the code so that it also considers the different time intervals?

 

Many thanks in advance

Best

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

It looks like there should be an easy way, but I only see a method with a lot of details.  So here goes.

 

data want;

set have;

by firmid year;

array firms {5} $;

array assets {5};

array yrdifs {5};

array amounts {5} lag1Asst lag2Asst lag3Asswt lag4Asst lag5Asst;

firms1 = lag1(firm); firms2 = lag2(firm); firms3 = lag3(firm); firms4 = lag4(firms); firms5 = lag5(firms);

assets1 = lag1(AssT); assets2 = lag2(AssT); assets3 = lag3(AssT); assets4 = lag4(AssT); assets5 = lag5(AssT);

yrdifs1 = dif1(year); yrdifs2 = dif2(year); yrdifs3 = dif3(year); yrdifs4 = dif4(year); yrdifs5 = dif5(year);

do _n_=1 to 5;

   if firms{_n_} = firm then do;

      if (1 <= yrdifs{_n_} <= 5) then amounts{yrdifs{_n_}} = assets{_n_};

   end;

end;

drop firm1-firm5 assets1-assets5 yrdifs1-yrdifs5;

run;

 

It's untested, and I wish I could think of something simpler.  But at least it ought to work. 

View solution in original post

7 REPLIES 7
Astounding
PROC Star

You'll have to define that a little more.  "Considers the different time intervals"?  What would you like the results to be for your sample data?

AnjaV
Calcite | Level 5

Hello astounding,

thanks for the quick reply. This is how I would like the results to be.

 

Data Have;

Input FirmID year AssT Lag1Asst Lag2Asst Lag3Asst;

Datalines;

1 1990 100 . . .

1 1991 10 100 . .

1 1994 3 . . 10

1 1996 5 . 3 . .

2 1990 100 . . .

2 1991 10 100 . .

2 1992 3 10 100 .

2 1993 5 3 10 100

;

Run;

 

Best

Anja

ballardw
Super User

I suspect that you may need to create lags of the Firmid and then something like:

 

If firmid = LagFirmid1 then <do something> ;

but it isn't quite clear.

You may also want temporary variables to hold the lag of asst and conditionally assign x[i] to the temporary variable. The drop the LagFirm and the asset temporaries.

 

When you have a number of variables to set to missing you may want to consider call missing:

 

Call missing (of x(*)); in the case of an entirer array or

Call missing (x,y, z, p, d, q);

Nice thing about call missing is you can mix variable types unlike most operations.

 

I might guess you are attempting some like this:

Data Lagged1 (drop=i );
   Set Have;
   By firmID year;
   array x(*) Assets1-Assets5;
   LAsst1=lag1 (AssT);
   LAsst2=lag2 (AssT);
   LAsst3=lag3 (AssT);
   LAsst4=lag4 (AssT);
   LAsst5=lag5 (AssT);
   LFirmId1=lag1 (FirmId);
   LFirmId2=lag2 (FirmId);
   LFirmId3=lag3 (FirmId);
   LFirmId4=lag4 (FirmId);
   LFirmId5=lag5 (FirmId);
   array LA Lasst1-LAsst5;
   array f LFirmId1-LFirmId5;
   call missing (of x(*));
   do i=1 to dim(x);
      if f[i]=firmid then x[i]= LA[i];
   end;
   drop Lf: LA:;
run;
Astounding
PROC Star

It looks like there should be an easy way, but I only see a method with a lot of details.  So here goes.

 

data want;

set have;

by firmid year;

array firms {5} $;

array assets {5};

array yrdifs {5};

array amounts {5} lag1Asst lag2Asst lag3Asswt lag4Asst lag5Asst;

firms1 = lag1(firm); firms2 = lag2(firm); firms3 = lag3(firm); firms4 = lag4(firms); firms5 = lag5(firms);

assets1 = lag1(AssT); assets2 = lag2(AssT); assets3 = lag3(AssT); assets4 = lag4(AssT); assets5 = lag5(AssT);

yrdifs1 = dif1(year); yrdifs2 = dif2(year); yrdifs3 = dif3(year); yrdifs4 = dif4(year); yrdifs5 = dif5(year);

do _n_=1 to 5;

   if firms{_n_} = firm then do;

      if (1 <= yrdifs{_n_} <= 5) then amounts{yrdifs{_n_}} = assets{_n_};

   end;

end;

drop firm1-firm5 assets1-assets5 yrdifs1-yrdifs5;

run;

 

It's untested, and I wish I could think of something simpler.  But at least it ought to work. 

AnjaV
Calcite | Level 5

Great, thank you very much. Works perfectly!!!!

 

Is it possible to modify your code so that it produces look-ahead values instead of lagged values?

Astounding
PROC Star

Not really.  SAS has decent tools for looking backward and a much more limited set of tools for looking forward through the data.  It is posible to program that, but the program wouldn't resemble this program.

mkeintz
PROC Star

You later asked about getting leads, for which there is no explicit SAS functions.  However, you can take advantage of the FIRSTOBS, KEEP, and RENAME parameters for data set names.

 

The program below assumes your data are sorted by FIRMID/YEAR.  You want leads up to 6 years in advance, but there may be "holes" in the years subsequent to the current record (i.e. current year may be 1998, but it is followed by 2000,2001,2003,2004 - missing are 1999 and 2002):

 

Notes:

  1. MERGE here purposely does NOT have a BY statement, and will not work correctly if it does.
  2. Each of the FIRSTOBS= datasets has an explicit list of KEEP variables and a corresponding RENAME list.

 



data want;
  merge have
        have (firstobs=2 keep=firmid year assets rename=(firmid=fid1 assets=ast1 year=yr1))
        have (firstobs=3 keep=firmid year assets rename=(firmid=fid2 assets=ast2 year=yr2))
        have (firstobs=4 keep=firmid year assets rename=(firmid=fid3 assets=ast3 year=yr3))
        have (firstobs=5 keep=firmid year assets rename=(firmid=fid4 assets=ast4 year=yr4))
        have (firstobs=6 keep=firmid year assets rename=(firmid=fid5 assets=ast5 year=yr5))
        have (firstobs=7 keep=firmid year assets rename=(firmid=fid6 assets=ast6 year=yr6))
        ;

  array ldassets {6};  /* lead values of assets, for year+1 through year+6 */

  array yr {6};
  array ast {6};
  array fid{6};

  do I=1 to 6 while (  fid{I}=firmid  and  (yr{I}-year)<= 6 );
    ldassets{yr{I}-year}=ast{I};
  end;
  drop I yr: ast: fid: ;

run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1219 views
  • 0 likes
  • 4 in conversation