BookmarkSubscribeRSS Feed
Akarsh91
Calcite | Level 5

Hi, 

 

I am working on the fama french paper and want to estimate the pre ranking betas. These betas are based on returns that each securities had 5 years ago. I used the following code in an attempt to achieve this;

 

 

data first;

set size1(keep=permno date);

by permno;

retain firstdate;

date=intnx('month', date, 1)-1;

if first.permno then firstdate=date;

if last.permno then do;

lastdate=date;

output;

end;

run;

 

data firstandlastdates;

set first;

format firstdate yymmdd10.;

format lastdate yymmdd10.;

run;

 

data permnosrankdates(rename=(date=rankdate));

set firstandlastdates;

date=firstdate;

do while(date<=lastdate);

output;

date=intnx('month', date+1, 1)-1;

end;

run;

 

data permnosrankdates;

set permnosrankdates;

date=rankdate;

i=1;

do while(i<=60);

output;

date=intnx ('month', date, 0)-1;

i=i+1;

end;

run;

 

data permnosrankdates;

set permnosrankdates;

format date yymmdd10.;

run;

 

data ff;

set size1 (keep=date ret vwretd exchcd);

date=intnx('month', date, 1)-1;

run;

 

proc sort data=permnosrankdates;

by date permno;

run;

proc sort data=ff;

by date;

run;

 

 

data permnosrankdates;

merge permnosrankdates(in=a) ff(in=b);

by date;

if a and b;

run;

 

data msf;

set size1(keep=permno date  ret vwretd  exchcd);

where ret is not missing;

date=intnx('month', date, 1)-1;

run;

 

proc sort data=msf;

by date permno;

run;

 

data permnosrankdates;

merge permnosrankdates(in=a) msf(in=b);

by date permno;

if a and b;

run;

Some output was obtained, however i got some very high beta values (for example -1500) when i ran the regressions later which suggests that something went wrong. Could anyone please help me to identify mistake in this code? Did this code lag the values correctly?
In addition, I also tried to lag the data by 60 months by using the regular lag function. This also doesn't work as the data I have is a cross section (panel data) of returns and stocks for the time period (1980-2015). Converting the date to yyyymm format and then lagging by 60 months doesn't help either. 

So my question is: How can I obtain the 60 months lagged returns of securites for a cross section of stock returns that is identified by and id variable=permno and a time varieable date? Could anyone please write the correct sas code for this problem? The data I am using is the cross section of stock returns for given dates. 

 

I would be greatful for any reply and help. I look forward for a reply. 

 

Regards

Akarsh

 

PS: i am using sas version 9.4

1 REPLY 1
mkeintz
PROC Star

Ordinarily you would get no answers to this question becuase you did not provide sample data for SIZE1, for FF, and for your desired outcome.

 

But I believe I am familiar with the particular data sources you are using.

 

You have data in SIZE1 sorted by permno/date in dataset size1, right?  I guess it's CRSP monthly data, where the date is the last trading day of the month.   You want to merge it with FF data, which I believe is also monthly, but is only a single series (i.e. no permno) and apparently uses last calendar date of the month.  If that's the case, then it's best to load the FF into a lookup table (called a hash object below) and just retrieve the needed values by means of hash lookup.  No need for all your sorting.

 

 

data need;
  set size1;
  if _n_=1 then do;
    if 0 then set FF;
    declare hash ffdata(dataset:'ff');
        ff.definekey('date');
        ff.definedata(all:'Y');
        ff.definedone();
  end;
  date=intnx('month',date,0,'end');
  rc=ffdata.find();
  date60=intnx('month',date,-60,'end');
run;

 

 

Notes:

  1. The hash object
    1. Is automatically "retained". That's why it is generated and populated only under the "IF _N_=1" do group.  You only need to create the hash object once.  "IF _N_=1" means to do this only during the first iteration of the data step (which usually means when processing the first observation).
    2. The "if 0 then set ff" statement tells sas to make entries in the program data vector for all the FF vars, but don't bother actually reading in the FF data.
    3. The hash object FFDATA, in this case, just takes in all the FF data:  declare hash ffdata(dataset:'FF') .  The are many attributes of the hash object that I won't go into, but are particulary useful for your task.
    4. It is "keyed" (indexed) on the date var:    ffdata.definekey('date')
    5. The data to be retrieved via that key is all the vars in FF:   ffdata.definedata(all:'Y').  If you only want some of the vars (say vars a b and c), you could just change the declare statement to   declare hash ffdata(dataset:'FF (keep=date a b c)');
  2. Getting end-of-month dates.  You don't need date=intnx('month',date,1)-1  You can use the alignment parameter 'end", as in date=intnx('month',date,0,'end').
  3. rc=ffdata.find() says to get the FF data for the date of the incoming SIZE1 observation.  RC=0 upon a successful lookup.
  4. Data set need now has ...
    1. Monthly data sorted by permno/date
    2. date=last calendar date of the month
    3. all the vars in SIZE1
    4. all the vars in FF for the same month
    5. A 2nd date variable DATE60, which has the date for 60 months prior.

This assumes, of course, that the FF has all the date values found in SIZE1 (after end-of-month adjustment).  Otherwise some FF values will be erroneous in the progarm above.  You would need to explicitly set them to missing whenever RC^=0.  So take a look at NEED before you run the program below to generate WANT.

 

Now you want 60 month lagged returns. For record J of a given permno you can just find record K of the same permno where DATE60 for J = DATE for K.  That's straightforward:

 

data want;
  merge need (in=incurrent
        need (in=in60months keep=date ret vwretd rename=(date=date60 ret=ret60 vwretd=vwretd60));
  by permno date60;
  if incurrent and in60months;
run;

 

  1. Now data WANT is still sorted by permno/date  (and also by permno/date60).
  2. It only has complete records, i.e.for which there are both current and 60-month prior data  (that's the "if incurrent and in60months").
  3. it has 2 new vars:   ret60 an vwretd60 represented the 60 month lagged data.   You can obviously get some lagged FF vars too by adding them to the "keep="  and "rename=" parameters
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 1522 views
  • 0 likes
  • 2 in conversation