BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AbuChowdhury
Fluorite | Level 6

Hi Experts,

I have daily returns data for firms. I would like to calculate the standard deviation of the returns (say, sigma) from the daily observations for the last 3 months on a rolling basis. For example, sigma for for firm i in december 2017 will be the standard deviations of the daily returns from september to november 2017. Similarly, sigma for firm i in november 2017 will be the standard deviations of the daily returns from august to october 2017.

Thanks,

Abu

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

OK. If you want retain those multiple returns in the same PERMNO and the same Names_Date.

 

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h(multidata:'y');
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();
 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);
     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;
 std=std(of x{*});
 output;
end;

h.clear();
drop i _ret n rc;
run;

View solution in original post

21 REPLIES 21
Reeza
Super User

Can you post some sample data? Have you tried anything so far as well? PROC EXPAND is probably where I'd start, but I'd also consider changing your definition. If you use the previous 90 days for example or X trading days. 3 months is not a standard measure and you'll see blips because of February issue alone.

 

 

Ksharp
Super User

proc sql;
create table want as
 select *,(select std(return) from have where id=a.id and date between intnx('month',a.date,-3) and a.date ) as std
   from have as a;
quit;

AbuChowdhury
Fluorite | Level 6

Thank you Xia. But proc sql takes very long time if there are millions of observations. Is there any other way such as using array or using macro?

PaigeMiller
Diamond | Level 26

Yes, of course calculations of this nature are going to take a very long time if you have millions of observations. If speed doing the calculation on millions of observations is a concern, you should have stated this at the beginning.

 

Please read what Reeza said and show us some sample data, and show us the desired results.

 

 

--
Paige Miller
AbuChowdhury
Fluorite | Level 6

Ksharp's (Xia) code is ok but I would like to know if it's possible to do using macro or using array. I attach the sample dataset. Test file is attached since sas data file cannot be attached.

PaigeMiller
Diamond | Level 26

Yes, of course it would be possible to write this as a macro. I don't know if it would be faster on millions of observations because I never tried. As @Reeza has stated, PROC EXPAND might also be a good solution.

 

Then, there is this solution, which works for regression coefficients with PROC REG, but something similar using PROC SUMMARY ought to work for standard deviations https://communities.sas.com/t5/SAS-Procedures/How-to-capture-the-regression-coeffecient-for-certain-...

(and this could be customized to handle the different number of days in each month)

--
Paige Miller
Reeza
Super User

If it was a fixed interval a different approach would be easier but since the number of changes each month depending on the month that's the biggest problem. 

 

You could probably expand the temporary array approach used here, but PROC EXPAND is still your best option. Have you looked into that at all? It would likely also be the fastest option as SAS would have optimized it already.

 

http://support.sas.com/kb/41/380.html

 

You would need to set the array size to the largest possible number of records, probably 93 to be safe. 

Ksharp
Super User

I think Hash Table could do that. Post some data ,maybe I could give a try.

AbuChowdhury
Fluorite | Level 6

Hi Xia,

Sample data is attached in text format. Please try with hash object.

Ksharp
Super User

OK. Try this one. Notice: each PERMNO and NAMES_DATE have only one Return.

 

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h();
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();
 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   if h.find(key:i)=0 then do;n+1;x{n}=_ret; end;
 end;
 std=std(of x{*});
 output;
end;

h.clear();
drop i _ret n;
run;
AbuChowdhury
Fluorite | Level 6

It's lot faster now. It also shows the result i.e. the standard deviation. But it shows the following error many times:

ERROR: Duplicate key.

 

Ksharp
Super User

That means there are multiply return for the same PERMNO and NAMES_DATE .

Make sure a PERMNO and a NAMES_DATE have only one return.

To sum(return) in each PERMNO and  NAMES_DATE combination.

After that running code would not generate that error information.

 

proc summary data=have;

by PERMNO NAMES_DATE  ;

var return;

output out=want sum=;

run;

 

OR you want the last return for each group of PERMNO and  NAMES_DATE.

data want;

 set have;

by PERMNO NAMES_DATE  ;

if last.NAMES_DATE  ;

run;

 

Ksharp
Super User

OK. If you want retain those multiple returns in the same PERMNO and the same Names_Date.

 

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h(multidata:'y');
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();
 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);
     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;
 std=std(of x{*});
 output;
end;

h.clear();
drop i _ret n rc;
run;
AbuChowdhury
Fluorite | Level 6

Thanks a lot Xia. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 21 replies
  • 5551 views
  • 2 likes
  • 6 in conversation