BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JKCho
Pyrite | Level 9

 

data WANT; 
   input YEAR $ return $ market return;
   datalines;                     
1990 0.11 0.05
1991 0.02 0.02
1992 0.03 0.05
1993 0.04 0.03
1994 0.05 0.08
1995 0.06 0.09
1996 0.07 .
1997 0.08 0.09
1998 0.07 0.07
1999 0.14 0.20 
2000 0.10 0.11

The above is a simple example of my dataset.

I am trying to calculate moving covariance(between return and market return) and moving variance of market returns.

As you may guess, I at the end want to have beta but the beta I need to have is not a coefficient from regression. The paper I follow did this way so I need to calculate cov(Rm, Ri) and Var(Rm).

 

I first thought to utilize RPOC EXPAND and yes, this at least gives me the result of variances by obtaining moving standard deviations from MOVSTD and I square them. Nonetheless, there is no command calculating COVARIANCE, quite sadly. 

 

Do anyone of you know how to calculate covariances given with the data like the above? Also, how can I deal with missing values when I use your own codes?

 

Moving covariance is tricky and what I have found so far is moving covariance for time-series only.

 

I appreciate your helps a lot!!!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
JKCho
Pyrite | Level 9

Hi Paige,

 

I did found the paper you suggested once but did not think to revise the codes for my case. 

I did revise some and share this with other users who may face similar difficulties.

 

proc expand DATA = result OUT = result2;
id date;
 convert VWRETD = YISUM / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert ret = XISUM / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert prodxiyi = PRODXIYISUM / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert obs = N / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert ret = XICSS / METHOD = none TRANSFORMOUT = (movcss 60 trimleft 36);
 convert VWRETD = YICSS / METHOD = none TRANSFORMOUT = (movcss 60 trimleft 36);
 convert VWRETD = STDrm / METHOD = none TRANSFORMOUT = (movstd 60 trimleft 36);
 convert ret = STDret / METHOD = none TRANSFORMOUT = (movstd 60 trimleft 36);
by cusip;
run;
data result3; set result2;
 R = (prodxiyisum - (yisum*xisum)/n) / ( sqrt(xicss)*sqrt(yicss)) ;
 cov=R*STDrm*STDret;
 var=(STDrm)**2;
 beta=cov/var;
 yyyy=year(date);
 if n="." then delete;
 if 1990<=yyyy<=2019;
run;

View solution in original post

10 REPLIES 10
PeterClemmensen
Tourmaline | Level 20

What do you mean by 'deal with' missing values?

 

Also, do you have multiple groups/IDs in your data, and should they be handled as well?

yabwon
Amethyst | Level 16

For the data you provided I think that something like below may be an option:

data HAVE; 
   input YEAR $ return market_return;
datalines;                     
1990 0.11 0.05
1991 0.02 0.02
1992 0.03 0.05
1993 0.04 0.03
1994 0.05 0.08
1995 0.06 0.09
1996 0.07 .
1997 0.08 0.09
1998 0.07 0.07
1999 0.14 0.20 
2000 0.10 0.11
;
run;

%let window = 3; /* one less than what you need */

data want;
  if &window.<2 then stop; /* prevent to short window */
  set have curobs=curobs;
  /* in case you need it in groups uncomment below 2 lines */
  /*
    by SOME_GROUPING_VARIABLE;
    if first.SOME_GROUPING_VARIABLE then call missing(of MR[*], of R[*]);
  */

  array R[0:&window.]  _temporary_;
  array MR[0:&window.] _temporary_;

  R[mod(curobs,&window)] = return; 
  MR[mod(curobs,&window)] = market_return;

  /* variance */
  if nmiss(of MR[*]) < &window. then MRvariance = var(of MR[*]);

  /* covariance */
  if nmiss(of MR[*]) < &window. and nmiss(of R[*]) < &window. then
    do;
      MRavg = mean(of MR[*]);
       Ravg = mean(of  R[*]);

       s = 0; n = 0;
       do i = 0 to &window.;
         if not nmiss(MR[i],R[i]) then
           do;
             s + (MR[i]-MRavg)*(R[i]-Ravg);
             n + 1;
           end;

         if n<2 then MR_R_covariance =.;
                else MR_R_covariance = s/(n-1);
       end;
    end;
run;
proc print data = want;
run;

 

all the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



PaigeMiller
Diamond | Level 26

In your earlier thread about moving averages, I pointed to https://support.sas.com/kb/25/027.html which allows you to compute moving averages in a data set. The same logic allows you to compute moving covariances (except instead of the part where you compute a moving average, you have to program it to compute the moving covariance).

--
Paige Miller
JKCho
Pyrite | Level 9
As I said,

I first thought to utilize RPOC EXPAND and yes, this at least gives me the result of variances by obtaining moving standard deviations from MOVSTD and I square them. Nonetheless, there is no command calculating COVARIANCE, quite sadly.

I was asking about obtaining moving covariance, not movage. You know it is not the same when there is no available PROC EXPAND command that is why I had to make this thread. clear?
PaigeMiller
Diamond | Level 26

You can look up the formula for covariance, and then replace the calculation of the mean in the DATA step example with the calculation of a covariance.

--
Paige Miller
PaigeMiller
Diamond | Level 26

There's also Example 2 in this paper which produces a moving correlation coefficient with PROC EXPAND; if you can calculate moving correlation then you can use the same ideas to calculate a moving covariance.

--
Paige Miller
JKCho
Pyrite | Level 9

Hi Paige,

 

I did found the paper you suggested once but did not think to revise the codes for my case. 

I did revise some and share this with other users who may face similar difficulties.

 

proc expand DATA = result OUT = result2;
id date;
 convert VWRETD = YISUM / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert ret = XISUM / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert prodxiyi = PRODXIYISUM / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert obs = N / METHOD = none TRANSFORMOUT = (movsum 60 trimleft 36);
 convert ret = XICSS / METHOD = none TRANSFORMOUT = (movcss 60 trimleft 36);
 convert VWRETD = YICSS / METHOD = none TRANSFORMOUT = (movcss 60 trimleft 36);
 convert VWRETD = STDrm / METHOD = none TRANSFORMOUT = (movstd 60 trimleft 36);
 convert ret = STDret / METHOD = none TRANSFORMOUT = (movstd 60 trimleft 36);
by cusip;
run;
data result3; set result2;
 R = (prodxiyisum - (yisum*xisum)/n) / ( sqrt(xicss)*sqrt(yicss)) ;
 cov=R*STDrm*STDret;
 var=(STDrm)**2;
 beta=cov/var;
 yyyy=year(date);
 if n="." then delete;
 if 1990<=yyyy<=2019;
run;
JKCho
Pyrite | Level 9
Thank you for bringing the paper back to me!
mkeintz
PROC Star

It's true theatproc expand can only generate moving statistics on univariate series, which excludes moving covariances.

 

BUT ... you can 

  1. create a 3rd variable   (product=return*market_return) in a new data set
  2. Use proc expand on that new dataset to get the moving sums of return, market_return, and product
  3. Take advantage of the fact that covariance can be calculated from those sums using the bottom line in this image
          image.png

In other words, for a windows of size 6  you would have   

    one sixth of the sum of product,

    minus 

    one thirty-sixth of the sum of return times the sum of market_return.

 

As to missing values, if they are rare, I'd just have 5 data points in my 6-year windows  (i.e. in your data, all the windows that normally would include 1996).  Just be sure to drop ALL the variables for 1996, not just market_return.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
JKCho
Pyrite | Level 9
Hello mkeintz,

Yeah, I did a similar approach you suggested. There is no covariance so I did calculate correlation and multiplied it by the standard deviation of returns and that of market returns.

As to missing values, dropping missing values are what I usually do but if I do so, this is somewhat against a pure definition of what I want to obtain, 24-month period covariance or so. What I have thought is to calculate standard deviations of 24 observations or less if there are missing values while the windows are still 24-month. SO... yeah, same. I just leave missing values since they are rare... like you said: I'd just have 5 data points in my 6-year windows.

Thank you for your tip and stay safe!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 3206 views
  • 7 likes
  • 5 in conversation