Hi Everyone,
At the end of each month t, I want to get the value of maximum return for each stock based on its daily returns from month t. There will be almost 20-25 daily observations in a month, but I am keeping very few in the sample for the convenience. Furthermore, I also want to get the average of two, three, four and five maximum values in a month for each stock too (I put few daily observations; therefore, I understand that the average of four and five maximum returns might not be possible, but I just need the code).
The original data is around 30 million observations. Please find the SAS data below. Based on this data, I estimate the maximum return on for stock ID 1 for 201001 (January 2010), and it should be 0.04135.
Thanks a lot for your help.
Best,
Cheema
data have;
infile cards expandtabs truncover;
input stock date : yymmdd8. ret ;
format date yymmdd8.;
cards;
1 20100107 0.04135
1 20100126 -0.02544
1 20100129 0.03384
1 20100201 0.01089
1 20100202 0.02621
1 20100226 0.01001
1 20100301 -0.0132
1 20100330 -0.0181
1 20100331 -0.00308
1 20100401 0.05394
1 20100429 -0.01077
1 20100430 0.0031
1 20100503 0.00931
1 20100521 0
1 20100531 -0.01427
1 20100601 -0.02718
1 20100617 0.05019
1 20100630 -0.01236
1 20100701 0
1 20100728 0.01193
1 20100730 0.01621
1 20100802 -0.0174
2 20100107 0.06069
2 20100126 -0.06219
2 20100129 0.01989
2 20100201 -0.04551
2 20100202 0.04768
2 20100226 -0.05462
2 20100301 0.02613
2 20100330 -0.04155
2 20100331 0.04336
2 20100401 -0.00536
2 20100429 -0.02561
2 20100430 -0.0166
2 20100503 0.00563
2 20100521 0
2 20100531 0.01119
2 20100601 0
2 20100617 0.01521
2 20100630 0.27384
2 20100701 0.09519
2 20100728 -0.12793
2 20100730 -0.04815
2 20100802 -0.02235
run
;
Cheema
1) sort the dataset by descending order
data have;
set have;
year=(date);
month=month(date);
run;
proc sort data=have;
by id year month descending ret;
run;
2) pick up the most large five obs for each id,after that do PROC SUMMAYR just as I did.
data top_five;
set have;
by id year month;
if first.month then n=0;
n+1;
if n le 5;
run;
So do you wish to output three observations for stock ID 1 for 201001 (January 2010) and only have the maximum value at the last observation in that month or only one observation per month? 🙂
Thanks, only one observation per month for each stock.
data have;
infile cards expandtabs truncover;
input stock date : yymmdd8. ret ;
format date yymmdd8.;
cards;
1 20100107 0.04135
1 20100126 -0.02544
1 20100129 0.03384
1 20100201 0.01089
1 20100202 0.02621
1 20100226 0.01001
1 20100301 -0.0132
1 20100330 -0.0181
1 20100331 -0.00308
1 20100401 0.05394
1 20100429 -0.01077
1 20100430 0.0031
1 20100503 0.00931
1 20100521 0
1 20100531 -0.01427
1 20100601 -0.02718
1 20100617 0.05019
1 20100630 -0.01236
1 20100701 0
1 20100728 0.01193
1 20100730 0.01621
1 20100802 -0.0174
2 20100107 0.06069
2 20100126 -0.06219
2 20100129 0.01989
2 20100201 -0.04551
2 20100202 0.04768
2 20100226 -0.05462
2 20100301 0.02613
2 20100330 -0.04155
2 20100331 0.04336
2 20100401 -0.00536
2 20100429 -0.02561
2 20100430 -0.0166
2 20100503 0.00563
2 20100521 0
2 20100531 0.01119
2 20100601 0
2 20100617 0.01521
2 20100630 0.27384
2 20100701 0.09519
2 20100728 -0.12793
2 20100730 -0.04815
2 20100802 -0.02235
;
run;
proc summary data=have;
by stock date;
format date monyy.;
var ret;
output out=want max=ret_max;
run;
Thanks Ksharp. It will be great if you can please help me with the second part where I want to get the average of two, three, four and five maximum values in a month for each stock too (I put few daily observations; therefore, I understand that the average of four and five maximum returns might not be possible based on the provided dataset, but please make it for the average of three returns in a month and I will edit the code then).
1) sort the dataset by descending order
data have;
set have;
year=(date);
month=month(date);
run;
proc sort data=have;
by id year month descending ret;
run;
2) pick up the most large five obs for each id,after that do PROC SUMMAYR just as I did.
data top_five;
set have;
by id year month;
if first.month then n=0;
n+1;
if n le 5;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.