Hi,
May I ask how to calculate: the standard deviation of each firm’s daily stock return over the past three months (SIGMA), using daily stock price data from CRSP ?
If there are fewer than five nonzero observations over the 3 months used in the rolling window computation, SIGMA will be counted as missing and replace with the cross-sectional mean of SIGMA.
Thanks
Have you looked around in the community, there are a lot of similar question already that have great answers like:
If you haven't loaded the data into SAS then take a look at http://support.sas.com/documentation/cdl/en/etsug/66840/HTML/default/viewer.htm#etsug_sasexccm_toc.h...
The easiest way to do this is probably storing the returns in an array, something like:
Data want;
set have;
by Stock_code:
array returns(0:90) 8 _temporary_; /* 3 months is about 91 days */
if first.Stock_code then
call missing(of returns(*));
returns(mod(_N_,91))=daily_return;
if n(returns)>1 then
st_dev=std(of returns(*));
run;
Note the use of a zero-based array, it makes the calculation of the array index a simple modula of _N_.
Hi s_lassen,
Thank you very much for your reply. If there is an additional requirement that if there are fewer than five nonzero observations over the 3 months used in the rolling window computation, SIGMA will be counted as missing and replace with the cross-sectional mean of SIGMA. May I ask how to add that in your following code to implement ?
Hi @Ksharp
Thank you very much for your kind reply in another thread on calculating rolling standard deviation. This is my new thread. I have built from your prior code. May I seek your advice on whether it is correct ?
proc sql;
create table want as
select *,
case when count(return) <5 then .
else
(select std(return) from have where id=a.id and date between intnx('month',a.date,-3) and a.date )
end as std
from have as a;
quit;
proc stdize data=have reponly method=mean out=want;
var std;
run;
By the way, it seems that proc sql runs slowly, may I seek your advice on how to add the requirement of coding std as missing and replacing as cross section mean of std, if there is less than five nonzero observations over the prior 3 months in the rolling window ?
using the following code you have provided in the another thread ? Thanks!!
proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
set have;
ret=input(Returns,?? best32.);
drop Returns;
monyy=intnx('month',Names_Date,0);
format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;
data want;
if _n_=1 then do;
if 0 then set have(rename=(ret=_ret));
declare hash h(multidata:'y');
h.definekey('Names_Date');
h.definedata('_ret');
h.definedone();
end;
array x{100} _temporary_;
do until(last.PERMNO);
set have;
by PERMNO;
_ret=ret;h.add();
end;
do until(last.PERMNO);
set have;
by PERMNO;
n=0;call missing(of x{*});
do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
rc=h.find(key:i);
do while(rc=0);
n+1;x{n}=_ret;
rc=h.find_next(key:i);
end;
end;
std=std(of x{*});
output;
end;
h.clear();
drop i _ret n rc;
run;
Yeah. Hash Table is faster. You could use the following code to see if count(return)<5.
If you create INDEX for SQL, then you would get faster.
count=0; /************/
do until(last.PERMNO);
set have;
by PERMNO;
n=0;call missing(of x{*});
do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
rc=h.find(key:i);
do while(rc=0);
if not missing(_RET) then count+1; /*************/
n+1;x{n}=_ret;
rc=h.find_next(key:i);
end;
end;
std=std(of x{*}); /* <-- Change it into your Cross STD*/
if count<5 then std=.; /***********/
output;
end;
h.clear();
Hi @Ksharp
Thank you very much for your reply. It works!
However, for each permno, how shall I require it has at least three months of data to calculate the standard deviation ? Currently, for each permno, the std for the second month is calculated from the first month.
Current results:
permno month std
10000 1 .
10000 2 0.5
10000 3 0.6
10000 4 0.7
20000 1 .
20000 2 0.3
20000 3 0.4
20000 4 0.5
So if I require the initial three month of data for the calculation of standard deviation, how can I get the results as follows:
Want results:
permno month std
10000 1 .
10000 2 .
10000 3 .
10000 4 0.7
20000 1 .
20000 2 .
20000 3 .
20000 4 0.5
Thanks
Check the number of array X ,and if it >3 then calculated your std.
if n(of x{*})>3 then do; /* your Cross STD*/ end;
Thank you very much @Ksharp
May I seek your advice on where I shall add this line ? Is it correct as follows:
count=0; /************/
do until(last.PERMNO);
set have;
by PERMNO;
n=0;call missing(of x{*});
do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
rc=h.find(key:i);
do while(rc=0);
if not missing(_RET) then count+1; /*************/
n+1;x{n}=_ret;
rc=h.find_next(key:i);
end;
end;
if n(of x{*})>3 then do;
std=std(of x{*}); /* <-- Change it into your Cross STD*/
if count<5 then std=.; /***********/
output;
end;
h.clear();
Oh. At this time , you can get rid of COUNT variable. just add a condition before your std.
do until(last.PERMNO);
set have;
by PERMNO;
n=0;call missing(of x{*});
do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
rc=h.find(key:i);
do while(rc=0);
if not missing(_RET) then count+1; /*************/
n+1;x{n}=_ret;
rc=h.find_next(key:i);
end;
end;
if n(of x{*})>3 then do;
/*****std=std(of x{*}); <-- here is your Cross STD*/
end;
And delete this statement:
if count<5 then std=.;
Oh. At this time , you can get rid of COUNT variable. just add a condition before your std.
do until(last.PERMNO);
set have;
by PERMNO;
n=0;call missing(of x{*});
do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
rc=h.find(key:i);
do while(rc=0);
if not missing(_RET) then count+1; /*************/
n+1;x{n}=_ret;
rc=h.find_next(key:i);
end;
end;
if n(of x{*})>3 then do;
/*****std=std(of x{*}); <-- here is your Cross STD*/
end;
And delete this statement:
if count<5 then std=.;
Hi @Ksharp, thanks for confirmation and sorry about keeping asking you this question.
I have two requirements for the past three months data:
1. replace the current month std as missing if there are less than 5 observations in the past three months
2. replace the current month std as missing if there is any month with all missing data in the past three months
If in the data, for one permno, there is data for past three months, but each month only have one observation, then this case does not fit the requirement of at least 5 observations (requirement 1) but fit the requirement of nonmissing data for any month among the past three months data (requirement 2). In order to take care of this extreme case, can I add both count and
n(of x{*})>3 ?
By the way, does
n(of x{*})>3 ? mean at least three observations or at least three months of data ? Since the requirement 2 requires at least three months of data instead of at least 3 observations
Is my code correct if combining both requirements as follows ? Thanks again!
count=0; /************/
do until(last.PERMNO);
set have;
by PERMNO;
n=0;call missing(of x{*});
do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
rc=h.find(key:i);
do while(rc=0);
if not missing(_RET) then count+1; /*************/
n+1;x{n}=_ret;
rc=h.find_next(key:i);
end;
end;
if n(of x{*})>3 then do;
std=std(of x{*}); /* <-- Change it into your Cross STD*/ if count<5 then std=.; /***********
" then this case does not fit the requirement of at least 5 observations (requirement 1) but fit the requirement of nonmissing data for any month among the past three months data (requirement 2)"
Oh. Maybe I understood your question. You want set STD be missing when anyone of month has all missing value , right ?
Check the following code. and if it works.
proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
set have;
ret=input(Returns,?? best32.);
drop Returns;
monyy=intnx('month',Names_Date,0);
format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;
data want;
if _n_=1 then do;
if 0 then set have(rename=(ret=_ret));
declare hash h(multidata:'y');
h.definekey('Names_Date');
h.definedata('_ret');
h.definedone();
/**********/
declare hash m();
m.definekey('_monyy');
m.definedata('_ret');
m.definedone();
/**********/
end;
array x{100} _temporary_;
do until(last.PERMNO);
set have;
by PERMNO;
_ret=ret;h.add();
end;
do until(last.PERMNO);
set have;
by PERMNO;
n=0;call missing(of x{*});
m.clear(); /**<---***/
do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
rc=h.find(key:i);
do while(rc=0);
if not missing(_RET) then do;n+1;x{n}=_ret; _monyy=intnx('month',i,0); m.replace(); end; /**<---***/
rc=h.find_next(key:i);
end;
end;
std=std(of x{*});
if n<5 then std=.; /**<-----****/
if m.num_items<3 then std=.; /**<----EDIT HERE-****/
output;
end;
h.clear();
drop i _ret n rc;
run;
Thank you! It works 🙂 @Ksharp
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.