Solved: Standard deviation on rolling basis

AbuChowdhury · Posted 03-17-2018 04:20 PM

Hi Experts,

I have daily returns data for firms. I would like to calculate the standard deviation of the returns (say, sigma) from the daily observations for the last 3 months on a rolling basis. For example, sigma for for firm i in december 2017 will be the standard deviations of the daily returns from september to november 2017. Similarly, sigma for firm i in november 2017 will be the standard deviations of the daily returns from august to october 2017.

Thanks,

Abu

Ksharp · Posted 03-21-2018 08:57 AM

OK. If you want retain those multiple returns in the same PERMNO and the same Names_Date.

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h(multidata:'y');
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();
 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);
     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;
 std=std(of x{*});
 output;
end;

h.clear();
drop i _ret n rc;
run;

View solution in original post

Reeza · Posted 03-17-2018 06:22 PM

Can you post some sample data? Have you tried anything so far as well? PROC EXPAND is probably where I'd start, but I'd also consider changing your definition. If you use the previous 90 days for example or X trading days. 3 months is not a standard measure and you'll see blips because of February issue alone.

Ksharp · Posted 03-18-2018 06:29 AM


proc sql;
create table want as
 select *,(select std(return) from have where id=a.id and date between intnx('month',a.date,-3) and a.date ) as std
   from have as a;
quit;

AbuChowdhury · Posted 03-18-2018 10:38 AM

Thank you Xia. But proc sql takes very long time if there are millions of observations. Is there any other way such as using array or using macro?

PaigeMiller · Posted 03-18-2018 12:15 PM

Yes, of course calculations of this nature are going to take a very long time if you have millions of observations. If speed doing the calculation on millions of observations is a concern, you should have stated this at the beginning.

Please read what Reeza said and show us some sample data, and show us the desired results.

--
Paige Miller

AbuChowdhury · Posted 03-18-2018 01:36 PM

Ksharp's (Xia) code is ok but I would like to know if it's possible to do using macro or using array. I attach the sample dataset. Test file is attached since sas data file cannot be attached.

PaigeMiller · Posted 03-18-2018 04:05 PM

Yes, of course it would be possible to write this as a macro. I don't know if it would be faster on millions of observations because I never tried. As @Reeza has stated, PROC EXPAND might also be a good solution.

Then, there is this solution, which works for regression coefficients with PROC REG, but something similar using PROC SUMMARY ought to work for standard deviations https://communities.sas.com/t5/SAS-Procedures/How-to-capture-the-regression-coeffecient-for-certain-...

(and this could be customized to handle the different number of days in each month)

--
Paige Miller

Reeza · Posted 03-18-2018 04:13 PM

If it was a fixed interval a different approach would be easier but since the number of changes each month depending on the month that's the biggest problem.

You could probably expand the temporary array approach used here, but PROC EXPAND is still your best option. Have you looked into that at all? It would likely also be the fastest option as SAS would have optimized it already.

http://support.sas.com/kb/41/380.html

You would need to set the array size to the largest possible number of records, probably 93 to be safe.

Ksharp · Posted 03-19-2018 10:36 AM

I think Hash Table could do that. Post some data ,maybe I could give a try.

AbuChowdhury · Posted 03-19-2018 12:51 PM

Hi Xia,

Sample data is attached in text format. Please try with hash object.

Ksharp · Posted 03-20-2018 09:38 AM

OK. Try this one. Notice: each PERMNO and NAMES_DATE have only one Return.

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h();
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();
 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   if h.find(key:i)=0 then do;n+1;x{n}=_ret; end;
 end;
 std=std(of x{*});
 output;
end;

h.clear();
drop i _ret n;
run;

AbuChowdhury · Posted 03-20-2018 02:27 PM

It's lot faster now. It also shows the result i.e. the standard deviation. But it shows the following error many times:

ERROR: Duplicate key.

Ksharp · Posted 03-21-2018 08:16 AM

That means there are multiply return for the same PERMNO and NAMES_DATE .

Make sure a PERMNO and a NAMES_DATE have only one return.

To sum(return) in each PERMNO and NAMES_DATE combination.

After that running code would not generate that error information.

proc summary data=have;

by PERMNO NAMES_DATE ;

var return;

output out=want sum=;

run;

OR you want the last return for each group of PERMNO and NAMES_DATE.

data want;

set have;

by PERMNO NAMES_DATE ;

if last.NAMES_DATE ;

run;

Ksharp · Posted 03-21-2018 08:57 AM

OK. If you want retain those multiple returns in the same PERMNO and the same Names_Date.

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h(multidata:'y');
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();
 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);
     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;
 std=std(of x{*});
 output;
end;

h.clear();
drop i _ret n rc;
run;

AbuChowdhury · Posted 03-21-2018 12:21 PM

Thanks a lot Xia.

Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Re: Standard deviation on rolling basis

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away