BookmarkSubscribeRSS Feed
jkoleti
Calcite | Level 5

I am currently working on back testing Joseph Piotroski's article Value investing: "The Use of Historical Financial Statement Information to Separate Winners from Losers". I am using Compustat data and CRSP data  from Whartons research data services (wrds). The idea is to replicate the research article. 

 

I am having trouble with the SAS code, finding and linking the returns between the two data sets. Returns, which have been defined by: " I measure firm-specific returns as one-year (two-year) buy-and-hold returns earned from the beginning of the fifth month after the firm’s fiscal year-end through the earliest subsequent date: one year (two years) after return compounding began or the last day of CRSP traded returns. If a firm delists, I assume the delisting return is zero. I chose the fifth month to ensure that the necessary annual financial information is available to investors at the time of portfolio formation. I define market-adjusted returns as the buy-and-hold return less the value-weighted market return over the corresponding time period. " 

 

Issue: I am having trouble turning this concept into workable SAS code. CRSP has the returns information and the dates are displayed as ex: 19860131 (Jan 31 1986) and Cstat has the fiscal year end for each firm displayed as a number 1-12. As you will see below, adding 5 months will in some cases move the start of the returns into the next year.  These two data sets link firms by a firm identifier code. 

 

Variable descriptions: gvkey= company identifier,Cstat= (Compustat) financials dataset, CRSP = returns dataset, t= fiscal year end, RET = returns, (yymm = year*12 + mm)

 

This is the concept, compound returns from t+5 to t+16, where t is the fiscal year end. RET is the returns.  This is the buy and hold return.

 

  1. i) So, for every month for every firm on CRSP, create 12 additional variables with returns from t+5 to t+16. Ex: if t is 10 (October) I need to find the returns from t+5, 3(March) through next 3 (march) 
  2. ii) Then, get 1+RET each of the 12 months, multiply all 12, and subtract 1 to get one year buy and hold return (BHR) for firm. So, you will add one more variable, the 12 month BHR. Do same BHR for the market portfolio, so you can do market adjusted returns like the paper. So, you have 14 addl vars, 12 monthly rets from t+5 to t=16 and the 2BHRs.

iii) Now, suppose you are doing this for IBM. If IBM has Dec year end, then you would have used 2011 Dec financials from IBM to do the F Score calculation for 2011. This should be merged with the May 2012 record for IBM from CRSP, so you are aligning the Dec 2011 F-Score with the correct returns starting from t+5.

  1. iv) You can use the monthly returns to calculate the portfolio returns (VW and EW) and then do the Jensen and FF regressions to get alphas.

 

This being prefaced, the SAS code would look something like this? 

 

If month >= 12 then month = month-12;

Confirm it is same firm (gvkey = lag1(gvkey);

Then, merge by gvkey match month;

Date from CSTAT matched with 5 months later on CRSP;

On CRSP match yymm = yymm; (yymm = year*12 + mm)

On CSTAT match yymm = fiscal end yymm + 5;

 

Ex from earlier,

For IBM 2011, FY end is Dec, yymm = (2011*12) + 12 = 24144

On CRSP, match yymm = year*12+mm,

On CStat, it will be 24149 to match on CRSP

Match May 2012 CRSP data with Cstat Dec 2011

 

This is a link to a PDF version of the research paper. https://www.chicagobooth.edu/~/media/FE874EE65F624AAEBD0166B1974FD74D.pdf

 

4 REPLIES 4
CarlosSpranger
Obsidian | Level 7
Hello @jkoleti,
I recommend you to read about the data step , % do loop and the merge statement.
Regards
CSB
jkoleti
Calcite | Level 5

Hi Carlos,

Thank you for your response.

Looking into the doloop.

 

Another comment/question for you. (I have attached screenshots below)

My CSTAT data set is annual information also has the fiscal year end variable(FYR). While CRSP data set is monthly returns and does not have fiscal year end variable. Like I mentioned earlier, I need to find the RET(Returns) for 5 months after the FYR ((month of fiscal year end) )month and then for the next 12 months. 

 

When I merge the two data sets I merge them by "matchpermno" which is the company identifier in both sets. I am not sure how to match RET to the fiscal year end month then stagger 5 months from that date to find the returns for the year. Basically, If IBM has Dec year end, then you would have used 2011 Dec financials from IBM to do the F_score calculation for 2011. This should be merged with the May 2012 record for IBM from CRSP, so you are aligning the Dec 2011 F-Score with the correct returns starting from t+5 (May).

 

 

 

 

 

CRSP:

jkoleti_0-1587653910081.png

 

CSTAT:

jkoleti_1-1587654275450.png

 

 

CarlosSpranger
Obsidian | Level 7

Hello  @jkoleti ,

 

I was playng around your challenge, and I think I need more inputs. But till there , I share a small draft that its possible that you can improve your idea from it.

CSB
CarlosSpranger
Obsidian | Level 7

Hello @jkoleti ,

This time, I read the details on your description text, and I came with the idea of getting the
elapsed time between the date and months after Date.

 

From here I think you carry on the with the rest of the validations

 

Regards

CSB

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 783 views
  • 1 like
  • 2 in conversation