turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Creating rolling 30-day returns variable

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-23-2014 04:44 PM

Hello All,

I would like to create a variable that, for each day, creates a variable that represents the following 30 days of returns. The data takes the following form:

Firm Date Ret

A 1/1/11 .1

A 1/2/11 .01

A 1/3/11 .1

A 1/4/11 .2

.

.

.

.

B 1/1/11 .05

B 1/2/11 .1

.

.

.

So the idea is for the 1/1/11 observation for Firm A, the variable would return the cumulative return, starting that day, and going forward one day. Just to make it simple, we could use addition, rather than a compounding return: .1 + .01 + .1 + .2 and so on for thirty days. And for the 1/2/11 observation for Firm A, it would be: .01 + .1 + .2 and so on for thirty more days.

And this would continue with all the firms that I have.

Any help would be greatly appreciated. I am going to be downloading a small file for you guys to play with if you need data to work out the problem.

Thanks!

John

Accepted Solutions

Solution

09-27-2014
08:21 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-27-2014 08:21 PM

Try this slight modification:

**proc sql;**

**create table want as**

**select **

** a.firm, **

** a.date, **

** sum(b.ret) as sum_ret, **

** exp(sum(log(1+b.ret)))-1 as compo_ret ,**

** count(b.ret) as n**

**from have as a**

**left join have as b**

**on a.firm=b.firm and b.date-a.date between 0 and 30**

**group by a.firm, a.date;**

**quit;**

You could also add the clause **having n > 30** at the end of the query to keep conplete periods only.

PG

PG

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-23-2014 09:51 PM

Obvious question: What should happen to the later records for each firm, where there are less than 30 subsequent days of data?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-23-2014 09:55 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-24-2014 04:20 AM

Should work, but requires a merge:

Data A (Keep=Firm Date R);

Do Firm_Nr=1 To 3;

Do Time=0 To 50;

Firm=Byte(Firm_Nr+64);

Date=IntNX('day',"01JAN2000"d,Time);

R=Normal(1)*0.02+0.03;

Output;

End;

End;

Format Date Date9.;

Run;

%Let Periods=30;

Data B;

Retain Total;

Set A;

Date=IntNX('day',Date,-&Periods.);

By Firm;

If First.Firm Then Do;

Total=0;

Count=0;

End;

Count+1;

Drop_Obs=Lag&Periods.(R);

If Count gt &Periods Then Total=Sum(Total,R,-Drop_Obs);

Else Total=Sum(Total,R);

Run;

Data C;

Merge A (in=inA) B (Keep=Firm Date Total in=inB);

By Firm Date;

If inA;

Run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-23-2014 10:37 PM

A SQL self join is a quick way, this will return an invalid number if you have less than 30 days of records. You could add a count to keep track of how many records and either eliminate them or set them to missing in the same query if you wanted.

(untested)

proc sql;

create table want as

select a.firm, a.date, sum(b.ret) as cum_ret

from have as a

left join have as b

on b.date-a.date between 0 and 30

order by a.firm, a.date;

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-27-2014 08:04 PM

Thank you, as always, for your help. For some reason, the code that you wrote is "hanging," meaning that I submit it and then it just sits there, without giving me an error. I don't really know what is going on, but I really like the idea behind your code, as it seems very intuitive to me.

Do you have any idea why this may be?

Thanks,

John

Solution

09-27-2014
08:21 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-27-2014 08:21 PM

Try this slight modification:

**proc sql;**

**create table want as**

**select **

** a.firm, **

** a.date, **

** sum(b.ret) as sum_ret, **

** exp(sum(log(1+b.ret)))-1 as compo_ret ,**

** count(b.ret) as n**

**from have as a**

**left join have as b**

**on a.firm=b.firm and b.date-a.date between 0 and 30**

**group by a.firm, a.date;**

**quit;**

You could also add the clause **having n > 30** at the end of the query to keep conplete periods only.

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-27-2014 08:37 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-27-2014 09:40 PM

No! that was a typo. I corrected it.

PG

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-25-2014 06:32 AM

John,

Let us have a small number of dates. There are 6 days for A and

4 for B. It can be made to work with 30 days or for any number of days.

Just change the macro variable, &days, to your context. Also, you need

to give the expected maximum number of dates. For instance, in the

following data set, Firm A has the maximum dates(6).

An Array is used to save the RET values for a Firm at a time to

build up the cumulative sum of RETs. To be efficient, the sums

beyond &days, are made in such a way that one subtraction and

one addition are made for every day thereafter.

As sums are cumulatively made and when the number of observations

forward are lesser than &days, partial sums are made, which you can take

or ignore. Once a Firm is finished, the next Firm is processed in

a DoW-loop.

data have;

informat Date mmddyy8.;

format Date date10.;

input Firm $ Date Ret;

datalines;

A 1/1/11 .1

A 1/2/11 .01

A 1/3/11 .1

A 1/4/11 .2

A 1/5/11 .3

A 1/6/11 .2

B 1/1/11 .05

B 1/2/11 .1

B 1/3/11 .2

B 1/4/11 .1

;

run;

proc sort data = have;

by Firm ;

run;

%let days = 3;

%let maxdates = 6;

data need;

array k[&maxdates] _temporary_;

do ind = 1 by 1 until(last.Firm);

set have;

by Firm;

k[ind] = Ret;

end;

sum_Ret = 0;

* Find first sum for &days ;

do ind = 1 to &days;

sum_Ret + k[ind];

end;

do ind = 1 by 1 until(last.Firm);

set have;

by Firm;

if ind = 1 then output;

* Find subsequent sums by adjusting SUM_RET (removing and adding RET);

else do;

sum_Ret +- k[ind - 1];

if ind <= (dim(k) - &days + 1) then sum_Ret ++ k[ind + &days - 1];

output;

end;

end;

call missing(of k

drop ind;

run;

Muthia Kachirayan

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-28-2014 08:50 AM

I have been playing with the code that you each posted and have still not been able to make it work. I have gone back and attached an exerpt of the dataset that I am working with. The entire dataset is large, something like 700MB and has 11mm observations.

When I run the following code (obviously calling my own libraries, etc) the code just sort of sits there and does nothing.

proc sql;

create table mylib.twt2_2 as

select t1.CUSIP,

t1.DATE,

sum(t2.RET) as cum_ret,

exp(sum(log(1+t2.RET)))-1 as compo_ret,

count(t2.RET) as n

from mylib.twt2_1 as t1

left join mylib.twt2_1 as t2

on t2.date-t1.date between 0 and 30 & (t1.CUSIP = t2.CUSIP)

group by t1.CUSIP, t1.date;

quit;

Could you guys have a look at the dataset that I have attached and see what may be going wrong?

Thanks!!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-28-2014 10:11 AM

I do not see your data.

By the way did you try the ARRAY approach I posted. If you have problems to understand/use the ARRAY approach, I can help you as much as I can.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-28-2014 12:30 PM

What version of SAS are you using?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-28-2014 01:24 PM

I am not sure what is your BYVar in TEST1. I am assuming it is STOCK_SYMBOL. If it is not you may replace it by your choice. Here is the Array way. There are 2265 rows with 'A' which is more than the next symbol 'ABI'.

I have not checked for the gaps in the dates. It is better you pre-process the data set and fill the gaps with suitable value for RET before running this program.

%let days = 30;

%let maxdates = 2265;

data need;

array k[&maxdates] _temporary_;

do ind = 1 by 1 until(last.stock_symbol);

set test1;

by stock_symbol;

k[ind] = Ret;

end;

sum_Ret = 0;

* Find first sum for &days ;

do ind = 1 to &days;

sum_Ret + k[ind];

end;

do ind = 1 by 1 until(last.stock_symbol);

set test1;

by stock_symbol;

if ind = 1 then output;

* Find subsequent sums by adjusting SUM_RET (removing and adding RET);

else do;

sum_Ret +- k[ind - 1];

if ind <= (dim(k) - &days + 1) then sum_Ret ++ k[ind + &days - 1];

output;

end;

end;

call missing(of k

drop ind;

run;

proc print data = need;

run;

Hope this works for you.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-28-2014 05:49 PM

I notice that your dataset only includes weekdays. So a 30 day interval will include less than 30 datapoints. It is not clear how you want to handle that.

Otherwise, the only thing being *wrong* is your dataset is its size!

If you have access to SAS/ETS, you should try **proc expand** to get what you want:

**/* Simple sum of returns */**

**proc expand data=sasforum.test1 out=test1;**

**by permno notsorted;**

**id date;**

**convert ret=ret_30 / transformout=(reverse movsum 30 reverse);**

**run;**

**/* Compounded product of returns */**

**proc expand data=sasforum.test1 out=test2;**

**by permno notsorted;**

**id date;**

**convert ret=compo_ret_30 / transformout=(+1 log reverse movsum 30 reverse exp -1);**

**run;**

it should be a lot faster than SQL.

(Tested with your example dataset)

PG

PG