how to run the regression on a rolling basis as follows? for each firm (gvkey) and date(datadate), include all the observations for this firm before this date but require at least 8 observations for this regression? the regression code is as follows
data erc; set data.erctest1; proc sort data=erc nodupkey; by gvkey; proc reg data=erc outest=cfsde adjrsq noprint; model car= ue; by gvkey; output out=varone r=residone;
the data is
data WORK.ERCTEST;
infile datalines dsd truncover;
input gvkey:$6. datadate:YYMMDDN8. ue:32. car:32.;
format datadate YYMMDDN8.;
label gvkey="Global Company Key" datadate="Data Date";
datalines;
001690 20070331 . .
001690 20070630 0.0004525206 0.4670161755
001690 20070930 0.0006423839 0.2271965327
001690 20071231 0.0038899365 0.0535349058
001690 20080331 -0.004237645 -0.04493543
001690 20080630 0.0001820518 0.1184123378
001690 20080930 0.0006338696 -0.242361881
001690 20081231 0.0147242809 -0.06718815
001690 20090331 -0.006772771 0.4468636412
001690 20090630 0.0016303555 0.0926635129
001690 20090930 0.0042211539 0.0569200846
001690 20091231 0.0044297225 0.0694110274
001690 20100331 -0.001422126 0.1062908265
001690 20100630 0.0007790463 0.1138357995
001690 20100930 0.0040591522 0.1280910542
001690 20101231 0.0057087261 -0.024709269
001690 20110331 -0.000052753 -0.053340722
001690 20110630 0.0042457609 0.1448117306
001690 20110930 -0.001933107 0.2521745565
001690 20111231 0.0170601425 -0.080865297
001690 20120331 -0.002572389 0.2810152156
001690 20120630 -0.005111778 0.0876576022
001690 20120930 -0.000959221 -0.032848439
001690 20121231 0.0097159076 -0.22854538
001690 20130331 -0.008485084 -0.223406274
001690 20130630 -0.007348195 -0.014162502
001690 20130930 0.0014275724 0.1887835903
001690 20131231 0.0111048837 0.0384054444
001690 20140331 -0.00615956 -0.078924939
001690 20140630 -0.004446851 0.1953453542
001690 20140930 0.0012165497 0.1066788925
001690 20141231 0.0148603652 0.0470219108
001690 20150331 -0.006213387 0.1303489251
001690 20150630 -0.004041365 0.0175582589
001690 20150930 0.0007264318 -0.071783252
001690 20151231 0.012400347 -0.069109732
001690 20160331 -0.013138595 -0.071681597
001690 20160630 -0.005275545 -0.119072014
001690 20160930 0.0020190515 0.2109433449
001690 20161231 0.0145839438 -0.026547759
001690 20170331 -0.009175424 0.1675285819
001690 20170630 -0.003105223 0.0070201798
001690 20170930 0.0025276878 0.0806324686
001690 20171231 0.010873663 -0.077033442
001690 20180331 -0.007527275 0.0769450604
001690 20180630 -0.002568958 0.1174891488
012141 20070331 . .
012141 20070630 -0.006840826 0.039301915
012141 20070930 0.0045501006 0.0791492442
012141 20071231 0.00125861 0.2159948102
012141 20080331 -0.001207337 -0.091396305
012141 20080630 -0.000361478 -0.013992142
012141 20080930 0.0003172004 0.2467016235
012141 20081231 -0.001151606 -0.171371867
012141 20090331 -0.00732306 -0.058326322
012141 20090630 0.0003211438 0.1484024492
012141 20090930 0.0023109721 -0.000563749
012141 20091231 0.0114983925 0.1113820094
012141 20100331 -0.010350052 -0.038707234
012141 20100630 0.0025670507 -0.094669369
012141 20100930 0.0042540329 -0.06558027
012141 20101231 0.005218999 0.0450195831
012141 20110331 -0.006549471 -0.113316172
012141 20110630 0.0029479832 0.0599936985
012141 20110930 -0.000649708 0.1114515078
012141 20111231 0.0040717526 -0.048890014
012141 20120331 -0.005595293 0.0330296286
012141 20120630 -0.021843021 -0.007530868
012141 20120930 0.0197814607 -0.068648255
012141 20121231 0.0085439504 -0.105270689
012141 20130331 -0.001348278 0.0355996234
012141 20130630 -0.00378879 0.1366469038
012141 20130930 0.0010044828 -0.111795675
012141 20131231 0.0042318432 -0.009889401
012141 20140331 -0.002652274 0.0791397488
012141 20140630 -0.003050357 0.066885466
012141 20140930 -0.000188136 0.0338976295
012141 20141231 0.0034658358 -0.001447478
012141 20150331 -0.002661945 -0.113894517
012141 20150630 -0.023081782 0.11049779
012141 20150930 0.0229078022 0.0872105183
012141 20151231 0.0002638288 0.204585892
012141 20160331 -0.002903418 -0.039893154
012141 20160630 -0.001586843 -0.090012729
012141 20160930 0.0056762626 0.0723214928
012141 20161231 0.0012491095 0.0365159269
012141 20170331 -0.001535477 0.033560407
012141 20170630 0.0048615462 0.0584440048
012141 20170930 -0.002596238 0.0345231002
012141 20171231 -0.019539189 0.1052017727
012141 20180331 0.0195564312 0.0792267377
012141 20180630 0.0019140615 0.0157609209
;;;;
First, the code
data erc;
set data.erctest1;
proc sort data=erc nodupkey;
by gvkey;
has a major inefficiency, and also a major mistake.
The inefficiency. You don't need the first data step. You can sort from one dataset to another. I.e. fro data.erctest1 to erc, as in
proc sort data=data.erctest1 out=erc ;
by gvkey datadate;
run;
Also you had "nodupkey" and "by gvkey" which would generate 1 record per gvkey. You want one record per gvkey/date, correct? Hence, I have "by gvkey datadate". And I didn't bother with nodupkey, assuming you only have one record per gvkey/datadate.
Now, to get rolling window regressions, I see that you want minimum rolling window size of 8, but what is the maximum size rolling window that you want? You could do this by brute force, by repeating each observation a sufficient number of times (and then another sort), but I suspect you'll be better off generating rolling sum-of-squares-and-cross-products, and submitting that to the regression. But first, please provide the maximum window size you want.
And also, have you insured that there are no holes in your time series? If there are, do you want window size based on date-range, or based simply on record count?
thank you! You are right that there is a mistake in the proc sort. Should be proc sort by gvkey datadate.
Yes I'd like the minimum # of obs in a regression of 8 observations and I'd like to use all the observations prior to the certain 'gvkey datadate'. For example, if for one gvkey datadate, I have 9 quarter of data prior to this date, I'd like to put them all in the regression. And if I have 10, I'd like to use all of them PRIOR to this date. The only requirement is I 'd like to have at least 8 observations.
How can I do this?
If you have sas/ets (which includes proc expand) you could consider the appendix of the paper @PGStats referred to. I'm "familiar" with the paper since I wrote it. In short the appendix makes a series of "rolling" window sums-of-squares-and-cross-products which can be submitted to proc reg, instead of the original data. At larger window sizes, this is an efficient approach. Although your 11 years of quarterly data suggest your average window size is a relatively small 26 observations (min size 8, max size 44).
What the program does is:
Sorry Mark for not giving you credit. I hadn't checked the name of the author. Cheers.
For a very efficient method to do all those regressions with proc expand and proc reg, look at Appendix 1 in
https://lexjansen.com/nesug/nesug12/fi/fi08.pdf
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.