Hi All,
I have a dataset which includes many firms. For each firm, there many years. There are only 2 months June and August in each firm-year. For each firm-year-month June (August), I would like to obtain the means of excess return (exret) of the same months and different months in the last 2 to 5 years (year t-2 to t-5).
For instance, first, for firm 1-June 2019, I would like to get the mean of (June 2018, June 2017,...., June 2014). Second, for firm 1-June 2019, I also would like to get the mean of (August 2018,August 2017,....,August 2014). Some firms do not have data for all last 2-5 years. (please see the screenshot of the dataset). I think that I need to run 2 SQL commands, first I tried the codes below to get the mean of same months in the last 2-5 years but unsuccessful.
"proc sql;
create table return_lag_25_c
as select a.*, mean(b.exret) as mean_25_c /* obtain the mean of excess return*/
from price_month2 as a, price_month2 as b
where a.tcode=b.tcode and a.ltd_mo=b.ltd_mo and 2 <= a.ltd_yr - b.ltd_yr<= 5 /* from the last year t-2 to t-5*/
group by a.tcode, a.ltd_yr, a.ltd_mo;
quit; "
/* It is not working-200920. there are 233606 rows in the output while there are only 110,330 in the input dataset price_month2*/
I hope that any one can share with me some tips to resolve this problem.
Thanks,
Chris
You can do it in one query, getting the means as subqueries (not tested):
proc sql;
create table return_lag_25_c
as select main.*,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo=main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_same_month,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo ne main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_different_month
from price_month2 as main
;
quit;
You can do it in one query, getting the means as subqueries (not tested):
proc sql;
create table return_lag_25_c
as select main.*,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo=main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_same_month,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo ne main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_different_month
from price_month2 as main
;
quit;
Dear Lassen,
It worked like a charm!
Thank you.
Chris
Hi @s_lassen ,
The codes is working perfectly for monthly data. However, when I applied them for daily data and the mean is estimated from lag data t-20 to t-6, it took days to run the codes. If there any way that we can reduce the running time?
Thanks,
Chris
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.