BookmarkSubscribeRSS Feed
Nieves
Quartz | Level 8

Hi,

 

May I ask how to calculate: the standard deviation of each firm’s daily stock return over the past three months (SIGMA), using daily stock price data from CRSP ?

 

If there are fewer than five nonzero observations over the 3 months used in the rolling window computation, SIGMA will be counted as missing and replace with the cross-sectional mean of SIGMA. 

 

Thanks

 

 

13 REPLIES 13
s_lassen
Meteorite | Level 14

The easiest way to do this is probably storing the returns in an array, something like:

Data want;
  set have;
  by Stock_code:
  array returns(0:90) 8 _temporary_; /* 3 months is about 91 days */
  if first.Stock_code then 
    call missing(of returns(*));
  returns(mod(_N_,91))=daily_return;
  if n(returns)>1 then
    st_dev=std(of returns(*));
run;

Note the use of a zero-based array, it makes the calculation of the array index a simple modula of _N_.

Nieves
Quartz | Level 8

 

Thank you very much for your reply. If there is an additional requirement that if there are fewer than five nonzero observations over the 3 months used in the rolling window computation, SIGMA will be counted as missing and replace with the cross-sectional mean of SIGMA. May I ask how to add that in your following code to implement ? 

Nieves
Quartz | Level 8

Hi @Ksharp

 

Thank you very much for your kind reply in another thread on calculating rolling standard deviation. This is my new thread. I have built from your prior code. May I seek your advice on whether it is correct ? 

 

proc sql;
create table want as
 select *,
case when count(return) <5 then .
else
(select std(return) from have where id=a.id and date between intnx('month',a.date,-3) and a.date )
end as std
   from have as a;
quit;

proc stdize data=have reponly method=mean out=want;

var std;
run;

 

 

 By the way, it seems that proc sql runs slowly, may I seek your advice on how to add the requirement of coding std as missing and replacing as cross section mean of std, if there is less than five nonzero observations over the prior 3 months in the rolling window ? 

using the following code you have provided in the another thread ? Thanks!!

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h(multidata:'y');
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();
 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);
     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;
 std=std(of x{*});
 output;
end;

h.clear();
drop i _ret n rc;
run;

 

 

Ksharp
Super User

Yeah. Hash Table is faster. You could use the following code to see if count(return)<5.

If you create INDEX for SQL, then you would get faster.

 


count=0;  /************/

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);

     if not missing(_RET) then count+1;   /*************/

     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;

 std=std(of x{*});  /* <-- Change it into your Cross STD*/

if count<5 then std=.; /***********/

 output;
end;

h.clear();

 

Nieves
Quartz | Level 8

Hi @Ksharp

 

Thank you very much for your reply. It works! 

 

However, for each permno, how shall I require it has at least three months of data to calculate the standard deviation ? Currently, for each permno, the std for the second month is calculated from the first month. 

 

Current results: 

permno  month std

10000       1       .

10000       2       0.5

10000       3       0.6

10000       4       0.7

20000       1       .

20000       2       0.3

20000       3       0.4

20000       4       0.5

 

So if I require the initial three month of data for the calculation of standard deviation, how can I get the results as follows: 

Want results: 

permno  month std

10000       1       .

10000       2       .

10000       3       .

10000       4       0.7

20000       1        .

20000       2        .

20000       3        .

20000       4       0.5

 

Thanks 

 

 

Ksharp
Super User

Check the number of array X ,and if it >3 then calculated your std.

 

if n(of x{*})>3 then do;  /*  your Cross STD*/ end;
Nieves
Quartz | Level 8

Thank you very much @Ksharp

May I seek your advice on where I shall add this line ? Is it correct as follows: 

count=0;  /************/

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);

     if not missing(_RET) then count+1;   /*************/

     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;
if n(of x{*})>3 then do; 
std=std(of x{*}); /* <-- Change it into your Cross STD*/ 
if count<5 then std=.; /***********/
output;
end;
h.clear();

 

Ksharp
Super User

Oh. At this time , you can get rid of COUNT variable. just add a condition before your std.

 

 



do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);

     if not missing(_RET) then count+1;   /*************/

     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;

if
 n(of x{*})>3 then do;
/*****std=std(of x{*}); <-- here is your Cross STD*/
end;

 

 

 

And delete this statement:

 if count<5 then std=.;

Ksharp
Super User

Oh. At this time , you can get rid of COUNT variable. just add a condition before your std.

 

 



do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);

     if not missing(_RET) then count+1;   /*************/

     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;

if
 n(of x{*})>3 then do;
/*****std=std(of x{*}); <-- here is your Cross STD*/
end;

 

 

 

And delete this statement:

 if count<5 then std=.;

Nieves
Quartz | Level 8

Hi @Ksharp, thanks for confirmation and sorry about keeping asking you this question. 

 

I have two requirements for the past three months data: 

1.  replace the current month std as missing  if there are less than 5 observations in the past three months 

2. replace the current month std as missing if there is any month with all missing data in the past three months 

 

If in the data, for one permno, there is data for past three months, but each month only have one observation, then this case does not fit the requirement of at least 5 observations (requirement 1) but fit the requirement of nonmissing data for any month among the past three months data (requirement 2). In order to take care of this extreme case, can I add both count and 

n(of x{*})>3 ?

By the way, does 

n(of x{*})>3 ? mean at least three observations or at least three months of data ? Since the requirement 2 requires at least three months of data instead of at least 3 observations

 

Is my code correct if combining both requirements as follows ? Thanks again! 

count=0;  /************/

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);

     if not missing(_RET) then count+1;   /*************/

     n+1;x{n}=_ret;
     rc=h.find_next(key:i);
   end;
 end;
if n(of x{*})>3 then do;
std=std(of x{*}); /* <-- Change it into your Cross STD*/ if count<5 then std=.; /***********

 

Ksharp
Super User

" then this case does not fit the requirement of at least 5 observations (requirement 1) but fit the requirement of nonmissing data for any month among the past three months data (requirement 2)"

 

Oh. Maybe I understood your question. You want set STD be missing when anyone of month has all missing value , right ?

 

Check the following code. and if it works.

 

 

proc import datafile='c:\temp\daily_ret.txt' out=have dbms=tab replace;
guessingrows=32767;
run;
data have ;
 set have;
 ret=input(Returns,?? best32.);
 drop Returns;
 monyy=intnx('month',Names_Date,0);
 format monyy yymmn6.;
run;
proc sort data=have;by PERMNO monyy Names_Date;run;

data want;
 if _n_=1 then do;
   if 0 then set have(rename=(ret=_ret));
   declare hash h(multidata:'y');
   h.definekey('Names_Date');
   h.definedata('_ret');
   h.definedone();

   /**********/
   declare hash m();
   m.definekey('_monyy');
   m.definedata('_ret');
   m.definedone();
   /**********/

 end;

array x{100} _temporary_;
do until(last.PERMNO);
 set have;
 by PERMNO;
 _ret=ret;h.add();
end;

do until(last.PERMNO);
 set have;
 by PERMNO;
 n=0;call missing(of x{*});
 m.clear();  /**<---***/
 do i=intnx('month',monyy,-3) to intnx('month',monyy,-1,'e');
   rc=h.find(key:i);
   do while(rc=0);
     if not missing(_RET) then do;n+1;x{n}=_ret;  _monyy=intnx('month',i,0); m.replace(); end; /**<---***/
     rc=h.find_next(key:i);
   end;
 end;
 std=std(of x{*});
 if n<5 then std=.; /**<-----****/
 if m.num_items<3 then   std=.;	 /**<----EDIT HERE-****/
 output;
end;

h.clear();
drop i _ret n rc;
run; 

 

Nieves
Quartz | Level 8

Thank you! It works 🙂 @Ksharp

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 5697 views
  • 3 likes
  • 4 in conversation