Hi,
I have a very large dataset, which has variable "date","x","y". there're multiple "x" - "y" pairs for a date. I need to calculate the rolling covariance of "x" and "y" for a given "date" using ("date"-21,"date") window, requiring at least four "x"-"y" pairs for the calculation and at least one "x"-"y" pairs on that "date".
How can I calculate this?
Some posts suggest using proc timeseries with outcrosscorr option. But it seems the base data set has only one "x"-"y" pair for a "date". I need to use all "x"-"y" pairs to do the calculation.
Thank you very much!
This is a good situation for someone to search the SAS communities for "rolling covariance" or "rolling correlations" as questions about this come up once in a while. For example, here's one such thread: https://communities.sas.com/t5/SAS-Programming/Rolling-pairwise-correlations/m-p/325493#M72390
Thanks for pointing out the post!
However, the situation is for my data, one "date" has multiple observations and I'll need to use all these observations to calculate covariance. So it'll be difficult for me to specify the rolling window in proc expand, which works on datasets that a date only has one observation.
@jl2978 wrote:
Thanks for pointing out the post!
However, the situation is for my data, one "date" has multiple observations and I'll need to use all these observations to calculate covariance. So it'll be difficult for me to specify the rolling window in proc expand, which works on datasets that a date only has one observation.
I'm not clear on how you would use multiple observations on a single date in a rolling window covariance, I don't think you have explained the details. But whatever the method, you would have to program that, and since the link I gave also pointed to some DATA step code, its not unreasonable for you to modify the data step code to handle this.
Thanks for your reply!
The sample data is below. I'll need to calculate covariance of r and r_lag for each cusip_id - trd_exctn_dt pair for a 21 days rolling window (trd_exctn_dt-21,trd_exctn_dt).
Obs	cusip_id	trd_exctn_dt	trd_exctn_tm	r	r_lag
1	000361AH8	20120214	14:02:44	-0.010491	0.008998
2	000361AH8	20120214	14:25:00	0.000964	-0.010491
3	000361AH8	20120214	16:11:01	-0.000790	0.000964
4	000361AH8	20120215	13:09:51	-0.006197	-0.000790
5	000361AH8	20120216	16:00:24	-0.008846	-0.006197
6	000361AH8	20120217	15:41:34	0.024823	-0.008846
7	000361AH8	20120217	15:44:27	0.001146	0.024823
8	000361AH8	20120217	16:03:22	-0.010782	0.001146
9	000361AH8	20120222	12:15:41	-0.002307	-0.010782
10	000361AH8	20120222	12:28:00	0.004831	-0.002307
11	000361AH8	20120222	12:28:09	0.004808	0.004831
12	000361AH8	20120222	12:30:25	-0.009091	0.004808
13	000361AH8	20120223	14:56:17	0.004346	-0.009091
14	000361AH8	20120223	15:11:36	-0.003606	0.004346
15	000361AH8	20120223	15:22:48	0.001206	-0.003606
16	000361AH8	20120227	13:34:41	-0.026506	0.001206
17	000361AH8	20120227	13:34:41	0.009901	-0.026506
18	000361AH8	20120228	14:51:56	0.002451	0.009901
19	000361AH8	20120302	11:33:39	0.001222	0.002451
20	000361AH8	20120308	12:59:49	-0.015385	0.001222
21	000361AH8	20120308	12:59:49	0.001984	-0.015385
22	000361AH8	20120319	9:59:55	0.016767	0.001984
23	000361AH8	20120319	14:12:49	-0.001816	0.016767
24	000361AH8	20120321	13:18:43	-0.022584	-0.001816
25	000361AH8	20120321	14:14:58	-0.000669	-0.022584
26	000361AH8	20120321	14:15:08	-0.000936	-0.000669
27	000361AH8	20120321	14:32:19	0.009684	-0.000936
28	000361AH8	20120321	14:34:24	0.001188	0.009684
29	000361AH8	20120321	15:38:36	-0.013548	0.001188
30	000361AH8	20120321	15:44:21	0.003759	-0.013548
31	000361AH8	20120321	15:46:45	0.002497	0.003759
32	000361AH8	20120321	15:46:57	-0.001245	0.002497
33	000361AH8	20120321	15:47:09	0.001247	-0.001245
34	000361AH8	20120322	10:16:51	-0.003478	0.001247
35	000361AH8	20120322	14:25:47	-0.003628	-0.003478
36	000361AH8	20120323	11:17:27	0.000873	-0.003628
37	000361AH8	20120323	15:29:00	-0.003539	0.000873
38	000361AH8	20120323	15:50:00	-0.002123	-0.003539
39	000361AH8	20120323	15:51:00	0.002894	-0.002123
40	000361AH8	20120323	16:09:32	-0.000985	0.002894
41	000361AH8	20120323	16:11:34	-0.001258	-0.000985
42	000361AH8	20120323	16:25:02	0.002519	-0.001258
43	000361AH8	20120323	16:44:00	-0.002010	0.002519
44	000361AH8	20120327	13:59:25	-0.003651	-0.002010
45	000361AH8	20120327	13:59:31	-0.001263	-0.003651
46	000361AH8	20120328	15:44:19	0.000633	-0.001263the output data will look like
Obs	cusip_id	trd_exctn_dt	cov
1	000361AH8	20120214	XIt's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
