BookmarkSubscribeRSS Feed
jl2978
Fluorite | Level 6

Hi,

 

I have a very large dataset, which has variable "date","x","y". there're multiple "x" - "y" pairs for a date. I need to calculate the rolling covariance of "x" and "y" for a given "date" using ("date"-21,"date") window, requiring at least four "x"-"y" pairs for the calculation and at least one "x"-"y" pairs on that "date".

 

How can I calculate this?

 

Some posts suggest using proc timeseries with outcrosscorr option. But it seems the base data set has only one "x"-"y" pair for a "date". I need to use all "x"-"y" pairs to do the calculation. 

 

Thank you very much!  

5 REPLIES 5
PaigeMiller
Diamond | Level 26

This is a good situation for someone to search the SAS communities for "rolling covariance" or "rolling correlations" as questions about this come up once in a while. For example, here's one such thread: https://communities.sas.com/t5/SAS-Programming/Rolling-pairwise-correlations/m-p/325493#M72390

--
Paige Miller
jl2978
Fluorite | Level 6

Thanks for pointing out the post!

However, the situation is for my data, one "date" has multiple observations and I'll need to use all these observations to calculate covariance. So it'll be difficult for me to specify the rolling window in proc expand, which works on datasets that a date only has one observation. 

PaigeMiller
Diamond | Level 26

@jl2978 wrote:

Thanks for pointing out the post!

However, the situation is for my data, one "date" has multiple observations and I'll need to use all these observations to calculate covariance. So it'll be difficult for me to specify the rolling window in proc expand, which works on datasets that a date only has one observation. 


I'm not clear on how you would use multiple observations on a single date in a rolling window covariance, I don't think you have explained the details. But whatever the method, you would have to program that, and since the link I gave also pointed to some DATA step code, its not unreasonable for you to modify the data step code to handle this.

--
Paige Miller
Reeza
Super User
If you have duplicate data on days you'll need to do it manually. Post some sample data and we can help with the code needed.
jl2978
Fluorite | Level 6

Thanks for your reply!

 

The sample data is below. I'll need to calculate covariance of r and r_lag for each cusip_id - trd_exctn_dt pair for a 21 days rolling window (trd_exctn_dt-21,trd_exctn_dt). 

Obs	cusip_id	trd_exctn_dt	trd_exctn_tm	r	r_lag
1	000361AH8	20120214	14:02:44	-0.010491	0.008998
2	000361AH8	20120214	14:25:00	0.000964	-0.010491
3	000361AH8	20120214	16:11:01	-0.000790	0.000964
4	000361AH8	20120215	13:09:51	-0.006197	-0.000790
5	000361AH8	20120216	16:00:24	-0.008846	-0.006197
6	000361AH8	20120217	15:41:34	0.024823	-0.008846
7	000361AH8	20120217	15:44:27	0.001146	0.024823
8	000361AH8	20120217	16:03:22	-0.010782	0.001146
9	000361AH8	20120222	12:15:41	-0.002307	-0.010782
10	000361AH8	20120222	12:28:00	0.004831	-0.002307
11	000361AH8	20120222	12:28:09	0.004808	0.004831
12	000361AH8	20120222	12:30:25	-0.009091	0.004808
13	000361AH8	20120223	14:56:17	0.004346	-0.009091
14	000361AH8	20120223	15:11:36	-0.003606	0.004346
15	000361AH8	20120223	15:22:48	0.001206	-0.003606
16	000361AH8	20120227	13:34:41	-0.026506	0.001206
17	000361AH8	20120227	13:34:41	0.009901	-0.026506
18	000361AH8	20120228	14:51:56	0.002451	0.009901
19	000361AH8	20120302	11:33:39	0.001222	0.002451
20	000361AH8	20120308	12:59:49	-0.015385	0.001222
21	000361AH8	20120308	12:59:49	0.001984	-0.015385
22	000361AH8	20120319	9:59:55	0.016767	0.001984
23	000361AH8	20120319	14:12:49	-0.001816	0.016767
24	000361AH8	20120321	13:18:43	-0.022584	-0.001816
25	000361AH8	20120321	14:14:58	-0.000669	-0.022584
26	000361AH8	20120321	14:15:08	-0.000936	-0.000669
27	000361AH8	20120321	14:32:19	0.009684	-0.000936
28	000361AH8	20120321	14:34:24	0.001188	0.009684
29	000361AH8	20120321	15:38:36	-0.013548	0.001188
30	000361AH8	20120321	15:44:21	0.003759	-0.013548
31	000361AH8	20120321	15:46:45	0.002497	0.003759
32	000361AH8	20120321	15:46:57	-0.001245	0.002497
33	000361AH8	20120321	15:47:09	0.001247	-0.001245
34	000361AH8	20120322	10:16:51	-0.003478	0.001247
35	000361AH8	20120322	14:25:47	-0.003628	-0.003478
36	000361AH8	20120323	11:17:27	0.000873	-0.003628
37	000361AH8	20120323	15:29:00	-0.003539	0.000873
38	000361AH8	20120323	15:50:00	-0.002123	-0.003539
39	000361AH8	20120323	15:51:00	0.002894	-0.002123
40	000361AH8	20120323	16:09:32	-0.000985	0.002894
41	000361AH8	20120323	16:11:34	-0.001258	-0.000985
42	000361AH8	20120323	16:25:02	0.002519	-0.001258
43	000361AH8	20120323	16:44:00	-0.002010	0.002519
44	000361AH8	20120327	13:59:25	-0.003651	-0.002010
45	000361AH8	20120327	13:59:31	-0.001263	-0.003651
46	000361AH8	20120328	15:44:19	0.000633	-0.001263

the output data will look like 

Obs	cusip_id	trd_exctn_dt	cov
1	000361AH8	20120214	X

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1035 views
  • 2 likes
  • 3 in conversation