I am currently working on back testing Joseph Piotroski's article Value investing: "The Use of Historical Financial Statement Information to Separate Winners from Losers". I am using Compustat data and CRSP data from Whartons research data services (wrds). The idea is to replicate the research article.
I am having trouble with the SAS code, finding and linking the returns between the two data sets. Returns, which have been defined by: " I measure firm-specific returns as one-year (two-year) buy-and-hold returns earned from the beginning of the fifth month after the firm’s fiscal year-end through the earliest subsequent date: one year (two years) after return compounding began or the last day of CRSP traded returns. If a firm delists, I assume the delisting return is zero. I chose the fifth month to ensure that the necessary annual financial information is available to investors at the time of portfolio formation. I define market-adjusted returns as the buy-and-hold return less the value-weighted market return over the corresponding time period. "
Issue: I am having trouble turning this concept into workable SAS code. CRSP has the returns information and the dates are displayed as ex: 19860131 (Jan 31 1986) and Cstat has the fiscal year end for each firm displayed as a number 1-12. As you will see below, adding 5 months will in some cases move the start of the returns into the next year. These two data sets link firms by a firm identifier code.
Variable descriptions: gvkey= company identifier,Cstat= (Compustat) financials dataset, CRSP = returns dataset, t= fiscal year end, RET = returns, (yymm = year*12 + mm)
This is the concept, compound returns from t+5 to t+16, where t is the fiscal year end. RET is the returns. This is the buy and hold return.
iii) Now, suppose you are doing this for IBM. If IBM has Dec year end, then you would have used 2011 Dec financials from IBM to do the F Score calculation for 2011. This should be merged with the May 2012 record for IBM from CRSP, so you are aligning the Dec 2011 F-Score with the correct returns starting from t+5.
This being prefaced, the SAS code would look something like this?
If month >= 12 then month = month-12;
Confirm it is same firm (gvkey = lag1(gvkey);
Then, merge by gvkey match month;
Date from CSTAT matched with 5 months later on CRSP;
On CRSP match yymm = yymm; (yymm = year*12 + mm)
On CSTAT match yymm = fiscal end yymm + 5;
Ex from earlier,
For IBM 2011, FY end is Dec, yymm = (2011*12) + 12 = 24144
On CRSP, match yymm = year*12+mm,
On CStat, it will be 24149 to match on CRSP
Match May 2012 CRSP data with Cstat Dec 2011
This is a link to a PDF version of the research paper. https://www.chicagobooth.edu/~/media/FE874EE65F624AAEBD0166B1974FD74D.pdf
Hi Carlos,
Thank you for your response.
Looking into the doloop.
Another comment/question for you. (I have attached screenshots below)
My CSTAT data set is annual information also has the fiscal year end variable(FYR). While CRSP data set is monthly returns and does not have fiscal year end variable. Like I mentioned earlier, I need to find the RET(Returns) for 5 months after the FYR ((month of fiscal year end) )month and then for the next 12 months.
When I merge the two data sets I merge them by "matchpermno" which is the company identifier in both sets. I am not sure how to match RET to the fiscal year end month then stagger 5 months from that date to find the returns for the year. Basically, If IBM has Dec year end, then you would have used 2011 Dec financials from IBM to do the F_score calculation for 2011. This should be merged with the May 2012 record for IBM from CRSP, so you are aligning the Dec 2011 F-Score with the correct returns starting from t+5 (May).
CRSP:
CSTAT:
Hello @jkoleti ,
I was playng around your challenge, and I think I need more inputs. But till there , I share a small draft that its possible that you can improve your idea from it.
Hello @jkoleti ,
This time, I read the details on your description text, and I came with the idea of getting the
elapsed time between the date and months after Date.
From here I think you carry on the with the rest of the validations
Regards
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.