Hi, I am trying to replicate "Asset Pricing Models and Financial Market Anomalies by Avramov & Chordia". It require me to include Data "having (Return) observation available for current and last 36 month (Return) observation".
I have find two codes but those aren't what I am looking for.
The First code remove all firms with less than 36 (Return) observation as whole, but don't account specifically LAST 36 MONTH:
proc sql;
create table mergeRET1 as select * from merge2
group by permno having count(ret) ge 36;
quit;
The Second code I find use Macro (complete macro code in the end) to create accumulated returns of past 12 month return. Adding following code lines will delete first 36 month (Return) observation but for every firm in the sample, which I dont want either:
if first.permno then count=1;
do i=count to 36; lagret[i] = .;
end;
count+1;
if lagret36 = . then delete;
What I want is, to remove firm(permno) which don't have PREVIOUS 36 months returns. Put it differently, I want firm(permno) having CONSECTIVE PREVIOUS 36 month return to remain intact.
Complete Macro code for using Accumulated Lag Return of 12 months:
proc sort data=merge1 ; by permno descending date; run;
%macro calcret;
data merge2; set merge1; by permno;
lret=log(ret+1); *Take natural logs of the firm return (ret);
array lagret[36] lagret1-lagret36; * Define arrays to create the lagged values;
%do j=1 %to 36; * macro "do-loop" to create the 12 necessary lags;
lagret&j=lag&j(lret);
%end;
/* following statements set to missing lagged values that are reading the data corresponding to the previous PERMNO */
if first.permno then count=1;
do i=count to 36; lagret[i] = .;
end;
count+1; if lagret36 = . then delete;
* Calculate the cumulative returns by adding the lags of the returns;
ret12= (exp( sum(of lret lagret2-lagret12)) -1)*100;
drop l: count i; *Drop unnecessary variables;
%mend calcret; * End of macro;
%calcret;
run;
Couldn't you get what you want with something like:
proc sql noprint; create table mergeRET1 as select * from merge2(where=(date between intnx('month',today(),-35,'b') and intnx('month',today(),0,'e'))) group by permno having count(ret) eq 36 and sum(month(date)) eq 234 ; quit;
Art, CEO, AnalystFinder.com
Thanks alot for your reply.
I tried to run the codes but output file is generated with zero observations.
I didn't understand the intitution behind number '234' so I ran the same code without last part "and sum(month(date)) eq 234" and still got zero observation in output file.
I would be really greatful if you could review the codes.
The codes I run. I replace merge2 with merge1 dataset and in line 121, change "eq 36" with "ge 36" as I percieve it will generate output having more than 36 consective observation. Codes with "eq 36" also generate zero observation.
119 create table mergeRET1 as select * from merge1
120 (where=(date between intnx('month',today(),-35,'b') and intnx('month',today(),0,'e')))
121 group by permno having count(ret) ge 36;
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.MERGERET1 created, with 0 rows and 13 columns.
121! * and sum(month(date)) eq 234;
122 quit;
The 234 is the sum of three 1's (January's), three 2's (Feb), ... three 12's (dec). That together with date between today and 35 months prior ensures exactly 36 consecutive months.
The code, per your specs (as I understood them), only accepted records for a permno if there was a record for May2017 and one record (and only one record) for each of the 35 months preceding May2017.
The 234 sum the month numbers, where jan=1, feb=2, etc.. Thus, 234 is what you would get if the summed the month values for 36 months.
Of course, if your 36 month don't include the current month, or if a permno can have more than one record for a given month, the code would have to be changed to accomodate the new specs.
Art, CEO, AnalystFinder.com
You want all consecutive monthly records for any permno having at least 36 consecutive months of data.
Let's assume your merge2 data is sorted by PERMNO, DATE, where DATE is the last calendar date of the month:
data want (drop=_:);
merge have
have (firstobs=36 keep=permno date rename=(permno=_permno36 date=_date36));
retain _countdown;
if lag(permno)^=permno then call missing(_countdown,spell);
if permno=_permno36 and _date36=intnx('month',date,35,'end') then _countdown=36;
else _countdown=_countdown-1;
if _countdown>0;
if _countdown>lag(_countdown) then spell+1;
run;
This program:
I think, in my earlier answer, I misread your request. You don't want more than 36 months, you just want the MOST RECENT consecutive 36 months for a given permno.
data want(drop=_:);
do _N=1 by 1 until (last.permno);
set have (keep=permno);
by permno;
merge have (keep=date)
have (firstobs=36 keep=permno date rename=(permno=_permno36 date=_date36));
if permno=_permno36 and intnx('month',date,35,'end')=_date36 then _win_start=_N;
end;
if _win_start^=. then _win_end=_win_start+35;
do _i=1 to _N;
set have;
if _win_start<=_i and _i<=_win_end then output;
end;
run;
This program:
This program assumes that dataset have is a month file sorted by permno/date, and that all dates are end-of-month values.
Actually I want both. Sorry for unable to clear it on the first place.
I want my data to include firms which have return available for last 36 month.
so eg:
Firm | Total Obs | Jan'01 | Feb'01 | Mar'01 | Apr'01 | May'01 | Jun'01 | Jul'01 | …………. | Jan'03 | Feb'03 | Mar'03 | Apr'03 | May'03 | Jun'03 | Jul'03 | Aug'03 | Sep'03 | Oct'03 | Nov'03 | Dec'03 | Jan'04 | Feb'04 | Mar'04 | Apr'04 | May'04 | Jun'04 | Jul'04 | Aug'04 | Sep'04 | Oct'04 | Nov'04 | Dec'04 |
A | 28 | 0.5 | 0.6 | 0.7 | 0.7 | 1.0 | 0.9 | 0.6 | …………. | 0.8 | 0.8 | 0.6 | 0.0 | ||||||||||||||||||||
B | 36 | 0.4 | 0.8 | 0.9 | 0.7 | 0.3 | 0.2 | 0.9 | …………. | 0.3 | 0.8 | 0.9 | 0.8 | 0.2 | 0.2 | 0.5 | 0.4 | 0.4 | 0.6 | 0.7 | 0.4 | ||||||||||||
C | 47 | 1.0 | 0.5 | 0.6 | 0.3 | 0.4 | 0.4 | …………. | 0.9 | 0.3 | 0.0 | 0.3 | 0.4 | 0.9 | 0.0 | 0.9 | 0.4 | 0.6 | 0.0 | 0.9 | 0.9 | 0.9 | 0.2 | 0.2 | 0.8 | 0.5 | 0.3 | 0.5 | 0.8 | 0.8 | 0.7 | 0.2 | |
D | 43 | 0.4 | 0.7 | 0.4 | 0.1 | 0.3 | 0.8 | 0.7 | …………. | 0.6 | 0.3 | 0.5 | 1.0 | 0.5 | 0.6 | 0.1 | 0.5 | 0.1 | 0.1 | 0.8 | 0.1 | 0.3 | 0.7 | 0.6 | 0.0 | 0.9 | 1.0 | 0.7 | 0.8 | 0.6 | 0.7 | 0.1 | |
Included | B & C | ||||||||||||||||||||||||||||||||
Not included | A & D |
So I want if the firm's return for previous 36 month is not available, the it should be dropped and not included. but if a firm is have previous 36 month return, it should be included with its all values without losing any observation.
I hope I made myself clear now. Please let me know if there is still anything that create confusion
Previous 36 months from when? And, regardless of your answer to 'from when', that month and the previous 35 months, or just the 36 months prior to 'that' date?
Art, CEO, AnalystFinder.com
Its not for a particular date. For each monthly (monthly ending) observation of firm (PERMNO) to be included, it should have previous 36 month of return.
as context " return in the current month, t, and over the past 36 months has to be available "
So each firm have return values for certain period, but only those return values of firm will be included in output which should have previous 36 month return values.
My data contains montly (month closing dates) values starting from July 1964 to Dec 2001. But many firms enter and leaves my sample within this time period
Still not sure if I understand. The following looks at the most recent date for a given permno, then takes that record along with the previous 35 months data. If the data aren't available for all 36 months, the permno's records are deleted:
data merge2; informat date date9.; format date date9.; input permno date ret; cards; 1 05may2005 50 1 05apr2005 50 1 05mar2005 50 1 05feb2005 50 1 05jan2005 50 1 05dec2004 50 1 05nov2004 50 1 05oct2004 50 1 05sep2004 50 1 05aug2004 50 1 05jul2004 50 1 05jun2004 50 1 05may2004 50 1 05apr2004 50 1 05mar2004 50 1 05feb2004 50 1 05jan2004 50 1 05dec2003 50 1 05nov2003 50 1 05oct2003 50 1 05sep2003 50 1 05aug2003 50 1 05jul2003 50 1 05jun2003 50 1 05may2003 50 1 05apr2003 50 1 05mar2003 50 1 05feb2003 50 1 05jan2003 50 1 05dec2002 50 1 05nov2002 50 1 05oct2002 50 1 05sep2002 50 1 05aug2002 50 1 05jul2002 50 1 05jun2002 50 1 05may2002 50 1 05apr2002 50 1 05mar2002 50 1 05feb2002 50 1 05jan2002 50 2 05may2017 50 2 05apr2017 50 2 05mar2017 50 2 05feb2017 50 2 05jan2017 50 2 05dec2016 50 2 05nov2016 50 2 05oct2016 50 2 05sep2016 50 2 05aug2016 50 2 05jul2016 50 2 05jun2016 50 2 05may2016 50 2 05apr2016 50 2 05mar2016 50 2 05feb2016 50 2 05jan2016 50 2 05dec2015 50 2 05nov2015 50 2 05aug2015 50 2 05jul2015 50 2 05jun2015 50 2 05may2015 50 2 05apr2015 50 2 05mar2015 50 2 05feb2015 50 2 05jan2015 50 2 05dec2014 50 2 05nov2014 50 2 05oct2014 50 2 05sep2014 50 2 05aug2014 50 2 05jul2014 50 2 05jun2014 50 2 05may2014 50 2 05apr2014 50 2 05mar2014 50 2 05feb2014 50 2 05jan2014 50 3 05may2017 50 3 05apr2017 50 3 05mar2017 50 3 05feb2017 50 3 05jan2017 50 3 05dec2016 50 3 05nov2016 50 3 05oct2016 50 3 05sep2016 50 3 05aug2016 50 3 05jul2016 50 3 05jun2016 50 3 05may2016 50 3 05apr2016 50 3 05mar2016 50 3 05feb2016 50 3 05jan2016 50 3 05dec2015 50 3 05nov2015 50 3 05oct2015 50 3 05sep2015 50 3 05aug2015 50 3 05jul2015 50 3 05jun2015 50 3 05may2015 50 3 05apr2015 50 3 05mar2015 50 3 05feb2015 50 3 05jan2015 50 3 05dec2014 50 3 05nov2014 50 3 05oct2014 50 3 05sep2014 50 3 05aug2014 50 3 05jul2014 50 3 05jun2014 50 3 05may2014 50 3 05apr2014 50 3 05mar2014 50 3 05feb2014 50 3 05jan2014 50 ; proc sql noprint; create table temp as select *,max(date) as maxdate from merge2 group by permno ; create table mergeRET1 as select * from temp (where=(intnx('month',maxdate,-35,'b')<=date<=intnx('month',maxdate,0,'e'))) group by permno having count(ret) eq 36 and sum(month(date)) eq 234 ; quit;
Art, CEO, AnalystFinder.com
I am really sorry as am very newbie to SaS. Macro and SQL are pretty hard for me to understand. what I understand from this program is in first step it create maximum date and in second step it group permno with count of ret.
I couldn't properly understand the (where) part. Can U plz tell me little bit about this step (where=(intnx('month',maxdate,-35,'b')<=date<=intnx('month',maxdate,0,'e')))
it should output for month 36 , 37, 38, 39 and 40th month as those would be the month having past 36 (consective) return observations
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.