dear all,
i have the dataset of stock returns for 163 stocks in daily frequency, i have to compute cumulative return for 50 preceding days and 200 preceding days for all the 163 stock returns. I need 50 days cumulative return in one file and 200 days cumulative returns in another file.
my present data file is in this format
Date | C1 | C2 | C3 | ……………… | C163 |
5/4/2010 | 0.034732 | -1.30527 | 1.414732 | 3.014732 | |
6/4/2010 | 0.204732 | 0.124732 | -0.32527 | 2.304732 | |
7/4/2010 | 2.394732 | 0.174732 | 1.314732 | -0.72527 | |
8/4/2010 | -1.26527 | -2.06527 | -0.22527 | -0.83527 | |
9/4/2010 | -0.48527 | 0.384732 | -0.28527 | 1.584732 | |
######## | -0.74641 | -1.14641 | -0.91641 | 3.173589 | |
######## | -0.90641 | 3.443589 | 0.973589 | -0.46641 | |
######## | -0.47641 | 0.373589 | -0.59641 | 1.833589 | |
######## | -0.53641 | -1.36641 | -0.73641 | -0.82641 | |
######## | -1.46617 | -2.06617 | 1.773835 | 3.353835 | |
######## | 0.133835 | 0.743835 | 6.183835 | -2.59617 | |
######## | 1.143835 | 0.083835 | 3.933835 | 2.473835 | |
######## | -1.31617 | 0.323835 | 1.743835 | 3.173835 | |
######## | -1.81617 | 0.593835 | 1.713835 | -0.57617 | |
######## | 0.533835 | 0.693835 | 1.893835 | -0.22617 | |
######## | -0.52617 | 0.883835 | 1.043835 | -1.14617 | |
######## | 0.543835 | -1.99617 | -2.63617 | -0.79617 | |
######## | -1.19617 | -0.86617 | -0.35617 | -0.75617 | |
######## | 0.973835 | -1.02617 | 0.103835 | 1.453835 | |
3/5/2010 | -0.35601 | 1.94399 | 0.35399 | -1.64601 | |
4/5/2010 | -2.72601 | 0.48399 | -0.33601 | -3.02601 | |
5/5/2010 | 0.73399 | -0.80601 | 0.25399 | 1.29399 | |
6/5/2010 | -1.12601 | 0.29399 | 0.02399 | -2.11601 | |
7/5/2010 | -1.33601 | -0.30601 | -0.65601 | -2.90601 | |
######## | 2.813835 | 0.823835 | 3.373835 | 2.243835 | |
######## | -0.40617 | -0.05617 | 0.223835 | -2.31617 | |
######## | -0.83617 | 0.253835 | -0.16617 | 3.283835 | |
######## | 1.443835 | 2.953835 | 0.343835 | 4.833835 | |
######## | 0.213835 | -2.05617 | 2.253835 | -2.18617 | |
######## | -3.32632 | -1.35632 | -2.28632 | 3.863679 | |
######## | 0.163679 | 0.353679 | 0.723679 | -3.56632 | |
######## | -2.23632 | 0.043679 | -1.69632 | -3.91632 | |
######## | -0.06632 | 0.243679 | 2.323679 | 3.513679 | |
######## | 1.753679 | -0.96632 | -0.64632 | -2.73632 | |
######## | -0.46942 | 0.150578 | 0.390578 | -1.56942 | |
######## | -1.60942 | -4.41942 | -1.93942 | -2.16942 | |
######## | -1.35942 | 2.830578 | 0.410578 | -0.08942 | |
######## | -1.65942 | 1.300578 | -6.48942 | 0.780578 | |
######## | 1.150578 | 2.230578 | -0.80942 | 1.870578 | |
######## | -1.22004 | -1.90004 | 0.509959 | 0.019959 | |
1/6/2010 | 0.169959 | 1.199959 | -1.32004 | -2.67004 | |
2/6/2010 | 0.889959 | 0.729959 | 1.199959 | 0.069959 | |
3/6/2010 | 1.959959 | -1.63004 | 1.529959 | 5.539959 | |
4/6/2010 | -0.20004 | -0.33004 | 0.309959 | 1.639959 | |
7/6/2010 | 1.039804 | -1.9902 | -2.6102 | -2.2102 |
required output format for 50 days cumulative returns is as follows
Date | C1_CR50 | C2_CR50 | C3_CR50 | ……………… | C163_CR50 |
16/6/2010 | 0.210412 | 0.189752 | -1.72066 | -0.04066 | |
17/6/2010 | 0.789863 | 0.769204 | -0.19066 | 0.31934 | |
18/6/2010 | -0.2325 | -0.25316 | -0.07066 | -0.70066 | |
21/6/2010 | 1.708799 | 1.68814 | 2.44934 | -0.44066 | |
22/6/2010 | -0.68886 | -0.70952 | -2.45066 | -0.35066 | |
23/6/2010 | 0.124064 | 0.103404 | 1.03934 | -0.19066 | |
24/6/2010 | -0.04792 | -0.06857 | -0.41066 | -0.27066 | |
25/6/2010 | -0.9736 | -0.99426 | -0.62066 | -0.21066 | |
28/6/2010 | 1.21576 | 1.19541 | 0.859649 | 0.169649 | |
29/6/2010 | -1.46089 | -1.48124 | -0.40035 | 0.089649 | |
30/6/2010 | 1.066371 | 1.046021 | 0.999649 | 0.649649 | |
1/7/2010 | -1.15678 | -1.17713 | -0.89035 | -0.46035 | |
2/7/2010 | -0.27268 | -0.29303 | -0.86035 | -0.17035 | |
5/7/2010 | -0.02292 | -0.04357 | 0.409342 | 0.009342 | |
6/7/2010 | 1.00999 | 0.989331 | -0.98066 | -0.72066 | |
7/7/2010 | -0.91072 | -0.93138 | -0.32066 | -0.28066 | |
8/7/2010 | 1.05809 | 1.037432 | -1.01066 | 0.689342 | |
9/7/2010 | 1.04421 | 1.023551 | -0.96066 | -0.27066 | |
12/7/2010 | 0.569144 | 0.548332 | -1.21081 | -0.09081 | |
13/7/2010 | 0.327348 | 0.306536 | -0.84081 | 0.139188 | |
14/7/2010 | -0.26885 | -0.28966 | -1.77081 | 0.779188 | |
15/7/2010 | -0.13748 | -0.1583 | 0.689188 | -0.03081 | |
16/7/2010 | 0.281268 | 0.260456 | 0.149188 | 0.519188 | |
19/7/2010 | -0.13821 | -0.16026 | -0.05205 | 0.607952 | |
20/7/2010 | -0.34311 | -0.36516 | -0.38205 | -0.19205 | |
21/7/2010 | 0.582318 | 0.56027 | 1.917952 | 0.927952 | |
22/7/2010 | 0.785888 | 0.76384 | -1.08205 | 1.737952 | |
23/7/2010 | 0.1313 | 0.109253 | 0.417952 | -0.69205 | |
26/7/2010 | -0.5613 | -0.58335 | 0.457952 | 2.447952 | |
27/7/2010 | 0.221215 | 0.199167 | 3.017952 | 3.057952 | |
28/7/2010 | -0.61045 | -0.6325 | 0.137952 | 0.657952 | |
29/7/2010 | 0.21006 | 0.188012 | 0.067952 | -1.44205 | |
30/7/2010 | -0.76649 | -0.78853 | -1.11205 | -0.02205 | |
2/8/2010 | 1.186207 | 1.163233 | 0.347025 | 4.717025 | |
3/8/2010 | 0.145338 | 0.122364 | -0.58297 | 2.087025 | |
4/8/2010 | 0.518915 | 0.49594 | -1.21297 | -1.63297 | |
5/8/2010 | -0.38021 | -0.40319 | -0.40297 | -0.46297 | |
6/8/2010 | -0.14422 | -0.16719 | 0.947025 | 0.447025 |
required output format for 200 days cumulative returns is
Date | C1_CR200 | C2_CR200 | C3_CR200 | ……………… | C163_CR200 |
16/6/2010 | 0.064732433 | -0.625267567 | 2.274732433 | -0.545267567 | -0.545267567 |
17/6/2010 | -1.235267567 | 0.034732433 | -1.095267567 | -0.985267567 | -0.985267567 |
18/6/2010 | 2.254732433 | -0.775267567 | 11.27473243 | -0.635267567 | -0.635267567 |
21/6/2010 | -0.185267567 | 0.034732433 | -4.445267567 | -1.545267567 | -1.545267567 |
22/6/2010 | 0.484732433 | -0.915267567 | 0.564732433 | 2.174732433 | 2.174732433 |
23/6/2010 | -1.226411246 | 0.963588754 | -1.556411246 | -0.466411246 | -0.466411246 |
24/6/2010 | 0.193588754 | -0.626411246 | 0.953588754 | -0.376411246 | -0.376411246 |
25/6/2010 | 0.613588754 | 5.763588754 | -3.166411246 | -1.456411246 | -1.456411246 |
28/6/2010 | -0.976411246 | 0.123588754 | -2.736411246 | 0.713588754 | 0.713588754 |
29/6/2010 | -0.986165446 | -1.806165446 | 1.353834554 | -1.646165446 | -1.646165446 |
30/6/2010 | 1.433834554 | 3.423834554 | 0.523834554 | 2.743834554 | 2.743834554 |
1/7/2010 | 1.623834554 | 0.613834554 | 1.593834554 | -0.646165446 | -0.646165446 |
2/7/2010 | -0.756165446 | -0.126165446 | -1.136165446 | 1.963834554 | 1.963834554 |
5/7/2010 | 0.063834554 | 0.293834554 | -2.216165446 | 0.253834554 | 0.253834554 |
6/7/2010 | 1.103834554 | -2.476165446 | 1.213834554 | -0.366165446 | -0.366165446 |
7/7/2010 | 1.753834554 | 0.583834554 | -1.696165446 | -0.366165446 | -0.366165446 |
8/7/2010 | 0.103834554 | 0.103834554 | -2.486165446 | -0.996165446 | -0.996165446 |
9/7/2010 | 0.263834554 | -0.436165446 | 0.473834554 | -1.096165446 | -1.096165446 |
12/7/2010 | -2.396165446 | 1.583834554 | -2.746165446 | 8.683834554 | 8.683834554 |
13/7/2010 | -0.516010208 | -1.196010208 | -1.816010208 | 1.063989792 | 1.063989792 |
14/7/2010 | -3.786010208 | -1.616010208 | -3.086010208 | 0.153989792 | 0.153989792 |
15/7/2010 | 0.323989792 | -0.526010208 | 4.893989792 | 2.953989792 | 2.953989792 |
16/7/2010 | -3.706010208 | -0.726010208 | -2.396010208 | 4.463989792 | 4.463989792 |
19/7/2010 | -0.506010208 | -2.286010208 | -3.556010208 | -3.306010208 | -3.306010208 |
20/7/2010 | 2.803834554 | -0.156165446 | 5.413834554 | 1.563834554 | 1.563834554 |
21/7/2010 | -3.246165446 | -1.346165446 | -1.906165446 | 2.013834554 | 2.013834554 |
22/7/2010 | -1.416165446 | -1.086165446 | 2.793834554 | 2.203834554 | 2.203834554 |
23/7/2010 | -0.066165446 | 2.693834554 | 1.563834554 | 1.853834554 | 1.853834554 |
26/7/2010 | -3.356165446 | -1.836165446 | -3.056165446 | -1.336165446 | -1.336165446 |
27/7/2010 | -0.256321002 | -1.246321002 | 1.803678998 | -0.906321002 | -0.906321002 |
28/7/2010 | 2.213678998 | -0.336321002 | -0.876321002 | -2.116321002 | -2.116321002 |
29/7/2010 | -3.126321002 | -2.156321002 | -5.226321002 | -3.156321002 | -3.156321002 |
30/7/2010 | -0.016321002 | -2.966321002 | -2.536321002 | -3.736321002 | -3.736321002 |
2/8/2010 | 1.133678998 | 2.203678998 | -0.796321002 | 0.473678998 | 0.473678998 |
3/8/2010 | 1.160577594 | -0.889422406 | -0.489422406 | -2.229422406 | -2.229422406 |
4/8/2010 | -1.799422406 | -2.479422406 | 0.850577594 | -4.619422406 | -4.619422406 |
5/8/2010 | 0.980577594 | 5.610577594 | 2.970577594 | 3.840577594 | 3.840577594 |
6/8/2010 | 2.340577594 | 8.000577594 | 0.440577594 | 1.250577594 | 1.250577594 |
please guide in writing the SAS code
Again, we request you provide the data as SAS data step code (instructions), as we can't work from the pasted data you have provided. You will get faster answers when you provide the data in this form. We should not have to repeatedly ask for this, we are trying to help you but you have to help us as well.
WRDS does support proc expand, and you could use it by mean of 163 statements in the proc.
But if you don't have PROC EXPAND (from the SAS/ETS module), then the data step below would work for the data structure you describe.
BTW, I would agree with Reeza's observation that having a single column of returns, sorted by stock and date would be much better for proc expand (one statement instead of 163). But the structure you have might actually be better if at some point you are looking for returns on multi-stock portfolios, when there is periodic rebalancing.
In the absence of sample data in the form of a working data step, this code is untested:
data want200 (drop=c1-c163 s);
set have;
array current_ret{163} c1-c163 ;
array logsum200 {163} _temporary_; /*Array of 200-day sums of log(1+daily return) */
array roll200 {163} roll200_c1-roll200_c163 ; /*Desired result variables */
do S=1 to 163; /*For each stock ... */
logsum200{S} + log(1+current_ret{s}) + (-coalesece(lag200(log(1+current_ret{s})),0); /*... update logsum200 */
roll200{S}=exp(logsum200{S}) - 1; /* get coresponding 200-day return*/
end;
if _n_>=200 ;
run;
Notes:
Assuming I understood your question and there are no gap between two date. data have; call streaminit(123); do stock=1 to 160; do date='01jan2015'd to '01jan2020'd; c1=rand('uniform'); c2=rand('uniform'); output; end; end; format date date9.; run; data want; set have; by stock; array x1{50} _temporary_; array x2{50} _temporary_; if first.stock then do;n=0;call missing(of x1{*} x2{*});end; n+1; i=mod(n,50); x1{i+1}=c1; x2{i+1}=c2; C1_CR50=sum(of x1{*}); C2_CR50=sum(of x2{*}); drop n i; run;
I am not quite sure I understand your output data: both tables with cumulative returns start 50 days after the first date, but the values are different.
One way to calculate the cumulative returns for many variables could be to use a two-dimensional array:
data want50;
set have;
array returns50 (0:50;1:163) 8 _temporary_;
array returns(*) c1-c163;
array cumulative(*) 8 cr50_c1-cr50_c163;
retain cr50_c1-cr50_C163;
do _I_=1 to dim(returns);
returns50(mod(_N_-1,51),_I_)=returns(_I_);
cumulative(_I_)+returns(_I_);
cumulative(_I_)+-returns50(mod(_N_,51),_I_));
end;
if _N_>=50;
drop _I_ c1-c163;
run;
So, apparently we are adding the day's returns and then subtracting the next day's returns from the cumulative return. How does that work? It works because the next day has not been read yet, so if there is anything there, it is the the return 50 days before that is subtracted. Note that first index of the two-dimensional array is zero-based, so that actual number of elements is 51*163 - as the return that we need to subtract is before the 50-day period.
I renamed your output variables so that they could be named for an array.
To calculate the 200 days cumulative return, change the number 50 to 200, and the number 51 to 201.
Although the OP didn't define what is meant by "cumulative returns", it's almost certainly not a simple sum of returns, but rather the compound return from
compound_return = (1+return1) *(1+return2) *...*(1+return50) - 1
or
compound_return = exp{log(1+return1) +...(log(return50)} - 1
So you'll probably have to store log(1+return{s,d}) in you rolling history array - where s indexes stock and d indexes day.
However, the approach of using a rolling history array, to be summed with every daily advance, does a lot more computing than using a rolling sum, updated by adding the most recent log(1+return) and subtracting the 50th preceding log(1+return). The bigger the rolling window, the more computing required by the rolling history approach. After all, returns for day number i will added 50 separate times, for day i through day i+49 - for a 50 day window. The rolling sum approach uses day i returns only twice - once to add it for day i and once to subtract it for day i+50 - no matter what size the window is.
Of course the rolling sum approach can introduce more computer precision error than the rolling history approach.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.