Hi All,
I have a dataset which includes many firms. For each firm, there many years. There are only 2 months June and August in each firm-year. For each firm-year-month June (August), I would like to obtain the means of excess return (exret) of the same months and different months in the last 2 to 5 years (year t-2 to t-5).
For instance, first, for firm 1-June 2019, I would like to get the mean of (June 2018, June 2017,...., June 2014). Second, for firm 1-June 2019, I also would like to get the mean of (August 2018,August 2017,....,August 2014). Some firms do not have data for all last 2-5 years. (please see the screenshot of the dataset). I think that I need to run 2 SQL commands, first I tried the codes below to get the mean of same months in the last 2-5 years but unsuccessful.
"proc sql;
create table return_lag_25_c
as select a.*, mean(b.exret) as mean_25_c /* obtain the mean of excess return*/
from price_month2 as a, price_month2 as b
where a.tcode=b.tcode and a.ltd_mo=b.ltd_mo and 2 <= a.ltd_yr - b.ltd_yr<= 5 /* from the last year t-2 to t-5*/
group by a.tcode, a.ltd_yr, a.ltd_mo;
quit; "
/* It is not working-200920. there are 233606 rows in the output while there are only 110,330 in the input dataset price_month2*/
I hope that any one can share with me some tips to resolve this problem.
Thanks,
Chris
You can do it in one query, getting the means as subqueries (not tested):
proc sql;
create table return_lag_25_c
as select main.*,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo=main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_same_month,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo ne main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_different_month
from price_month2 as main
;
quit;
You can do it in one query, getting the means as subqueries (not tested):
proc sql;
create table return_lag_25_c
as select main.*,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo=main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_same_month,
(select mean(exret) from price_month2 as b
where tcode=main.tcode
and ltd_mo ne main.ltd_mo
and ltd_yr between main.ltd_yr-5 and main.ltd_yr-2)
as mean_different_month
from price_month2 as main
;
quit;
Dear Lassen,
It worked like a charm!
Thank you.
Chris
Hi @s_lassen ,
The codes is working perfectly for monthly data. However, when I applied them for daily data and the mean is estimated from lag data t-20 to t-6, it took days to run the codes. If there any way that we can reduce the running time?
Thanks,
Chris
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.