Hi,
I have a very large dataset, which has variable "date","x","y". there're multiple "x" - "y" pairs for a date. I need to calculate the rolling covariance of "x" and "y" for a given "date" using ("date"-21,"date") window, requiring at least four "x"-"y" pairs for the calculation and at least one "x"-"y" pairs on that "date".
How can I calculate this?
Some posts suggest using proc timeseries with outcrosscorr option. But it seems the base data set has only one "x"-"y" pair for a "date". I need to use all "x"-"y" pairs to do the calculation.
Thank you very much!
This is a good situation for someone to search the SAS communities for "rolling covariance" or "rolling correlations" as questions about this come up once in a while. For example, here's one such thread: https://communities.sas.com/t5/SAS-Programming/Rolling-pairwise-correlations/m-p/325493#M72390
Thanks for pointing out the post!
However, the situation is for my data, one "date" has multiple observations and I'll need to use all these observations to calculate covariance. So it'll be difficult for me to specify the rolling window in proc expand, which works on datasets that a date only has one observation.
@jl2978 wrote:
Thanks for pointing out the post!
However, the situation is for my data, one "date" has multiple observations and I'll need to use all these observations to calculate covariance. So it'll be difficult for me to specify the rolling window in proc expand, which works on datasets that a date only has one observation.
I'm not clear on how you would use multiple observations on a single date in a rolling window covariance, I don't think you have explained the details. But whatever the method, you would have to program that, and since the link I gave also pointed to some DATA step code, its not unreasonable for you to modify the data step code to handle this.
Thanks for your reply!
The sample data is below. I'll need to calculate covariance of r and r_lag for each cusip_id - trd_exctn_dt pair for a 21 days rolling window (trd_exctn_dt-21,trd_exctn_dt).
Obs cusip_id trd_exctn_dt trd_exctn_tm r r_lag
1 000361AH8 20120214 14:02:44 -0.010491 0.008998
2 000361AH8 20120214 14:25:00 0.000964 -0.010491
3 000361AH8 20120214 16:11:01 -0.000790 0.000964
4 000361AH8 20120215 13:09:51 -0.006197 -0.000790
5 000361AH8 20120216 16:00:24 -0.008846 -0.006197
6 000361AH8 20120217 15:41:34 0.024823 -0.008846
7 000361AH8 20120217 15:44:27 0.001146 0.024823
8 000361AH8 20120217 16:03:22 -0.010782 0.001146
9 000361AH8 20120222 12:15:41 -0.002307 -0.010782
10 000361AH8 20120222 12:28:00 0.004831 -0.002307
11 000361AH8 20120222 12:28:09 0.004808 0.004831
12 000361AH8 20120222 12:30:25 -0.009091 0.004808
13 000361AH8 20120223 14:56:17 0.004346 -0.009091
14 000361AH8 20120223 15:11:36 -0.003606 0.004346
15 000361AH8 20120223 15:22:48 0.001206 -0.003606
16 000361AH8 20120227 13:34:41 -0.026506 0.001206
17 000361AH8 20120227 13:34:41 0.009901 -0.026506
18 000361AH8 20120228 14:51:56 0.002451 0.009901
19 000361AH8 20120302 11:33:39 0.001222 0.002451
20 000361AH8 20120308 12:59:49 -0.015385 0.001222
21 000361AH8 20120308 12:59:49 0.001984 -0.015385
22 000361AH8 20120319 9:59:55 0.016767 0.001984
23 000361AH8 20120319 14:12:49 -0.001816 0.016767
24 000361AH8 20120321 13:18:43 -0.022584 -0.001816
25 000361AH8 20120321 14:14:58 -0.000669 -0.022584
26 000361AH8 20120321 14:15:08 -0.000936 -0.000669
27 000361AH8 20120321 14:32:19 0.009684 -0.000936
28 000361AH8 20120321 14:34:24 0.001188 0.009684
29 000361AH8 20120321 15:38:36 -0.013548 0.001188
30 000361AH8 20120321 15:44:21 0.003759 -0.013548
31 000361AH8 20120321 15:46:45 0.002497 0.003759
32 000361AH8 20120321 15:46:57 -0.001245 0.002497
33 000361AH8 20120321 15:47:09 0.001247 -0.001245
34 000361AH8 20120322 10:16:51 -0.003478 0.001247
35 000361AH8 20120322 14:25:47 -0.003628 -0.003478
36 000361AH8 20120323 11:17:27 0.000873 -0.003628
37 000361AH8 20120323 15:29:00 -0.003539 0.000873
38 000361AH8 20120323 15:50:00 -0.002123 -0.003539
39 000361AH8 20120323 15:51:00 0.002894 -0.002123
40 000361AH8 20120323 16:09:32 -0.000985 0.002894
41 000361AH8 20120323 16:11:34 -0.001258 -0.000985
42 000361AH8 20120323 16:25:02 0.002519 -0.001258
43 000361AH8 20120323 16:44:00 -0.002010 0.002519
44 000361AH8 20120327 13:59:25 -0.003651 -0.002010
45 000361AH8 20120327 13:59:31 -0.001263 -0.003651
46 000361AH8 20120328 15:44:19 0.000633 -0.001263
the output data will look like
Obs cusip_id trd_exctn_dt cov
1 000361AH8 20120214 X
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.