hello everyone
I am new to sas and my question is a bit basic.
I want to extract for each fund_id in each month , the fama french alpha using 60 month previous observations using this model
mretf = alpha +b1 MKTRF+b2 SMB + b3 HML+ b4 UMD; my sample period is 2001-2016 and the starting period of my study is 01/01/2005 .
I have tried the following code and I think it works well but my question is about the index ( I=mod(N-1,60)+1). Since the table rf4 created by sas,skip the first observation at each new id. I have posted here for one fund an example of the data.
For 31 jan 2005 , the table rf4 created by sas skip the first obs( 31 jan 2000) and begins by 29 fev 2000 until 31 jan 2005.
Your help is much appreciated. Thanks
CALDT | FACTSET_FUND_ID | mretf | SMB | HML | MKTRF | UMD |
31-Jan-00 | A | 0.02107 | 0.0501 | -0.0045 | -0.0474 | 0.0188 |
29-Feb-00 | A | 0.25627 | 0.2208 | -0.1049 | 0.0245 | 0.1838 |
31-Mar-00 | A | -0.08662 | -0.1717 | 0.0786 | 0.052 | -0.068 |
28-Apr-00 | A | -0.11902 | -0.0777 | 0.0857 | -0.064 | -0.0852 |
31-May-00 | A | -0.09957 | -0.0494 | 0.0239 | -0.0442 | -0.0906 |
30-Jun-00 | A | 0.085008 | 0.1395 | -0.1003 | 0.0464 | 0.165 |
31-Jul-00 | A | -0.02268 | -0.0278 | 0.0815 | -0.0251 | -0.0012 |
31-Aug-00 | A | 0.085901 | -0.0111 | -0.0069 | 0.0703 | 0.057 |
29-Sep-00 | A | -0.09272 | -0.014 | 0.0623 | -0.0545 | 0.0215 |
31-Oct-00 | A | -0.07226 | -0.0377 | 0.0555 | -0.0276 | -0.0463 |
30-Nov-00 | A | -0.16127 | -0.0277 | 0.1129 | -0.1072 | -0.0244 |
29-Dec-00 | A | 0.029903 | 0.0096 | 0.073 | 0.0119 | 0.0673 |
31-Jan-01 | A | 0.066976 | 0.0656 | -0.0488 | 0.0313 | -0.2501 |
28-Feb-01 | A | -0.15926 | -0.0072 | 0.1291 | -0.1005 | 0.1248 |
29-Mar-01 | A | -0.10098 | 0.0039 | 0.0649 | -0.0726 | 0.0838 |
30-Apr-01 | A | 0.085569 | 0.005 | -0.0469 | 0.0794 | -0.0812 |
31-May-01 | A | 0.002938 | 0.026 | 0.0313 | 0.0072 | 0.0217 |
29-Jun-01 | A | -0.02129 | 0.0603 | -0.0109 | -0.0194 | 0.0034 |
31-Jul-01 | A | -0.04485 | -0.0435 | 0.0561 | -0.0213 | 0.055 |
31-Aug-01 | A | -0.05641 | 0.025 | 0.0256 | -0.0646 | 0.0554 |
28-Sep-01 | A | -0.12993 | -0.0611 | 0.0156 | -0.0925 | 0.1155 |
31-Oct-01 | A | 0.047347 | 0.0756 | -0.0802 | 0.0246 | -0.0838 |
30-Nov-01 | A | 0.032566 | -0.004 | 0.02 | 0.0754 | -0.0857 |
31-Dec-01 | A | 0.018145 | 0.0456 | 0.0108 | 0.0161 | -0.0003 |
31-Jan-02 | A | -0.01629 | 0.0119 | 0.0335 | -0.0144 | 0.0376 |
28-Feb-02 | A | -0.01538 | -0.0114 | 0.0251 | -0.0229 | 0.0679 |
28-Mar-02 | A | 0.045198 | 0.0426 | 0.0111 | 0.0424 | -0.0164 |
30-Apr-02 | A | -0.00575 | 0.0594 | 0.0392 | -0.052 | 0.0798 |
31-May-02 | A | -0.0054 | -0.0322 | 0.017 | -0.0138 | 0.0304 |
28-Jun-02 | A | -0.04528 | 0.0426 | 0.0012 | -0.0721 | 0.0614 |
31-Jul-02 | A | -0.09043 | -0.0537 | -0.0352 | -0.0818 | 0.0338 |
30-Aug-02 | A | -0.00355 | -0.0243 | 0.0253 | 0.005 | 0.0175 |
30-Sep-02 | A | -0.06487 | 0.0255 | 0.0126 | -0.1035 | 0.0915 |
31-Oct-02 | A | 0.024372 | -0.0286 | -0.054 | 0.0784 | -0.0546 |
29-Nov-02 | A | 0.032228 | 0.0295 | -0.0095 | 0.0596 | -0.1628 |
31-Dec-02 | A | -0.0285 | 0 | 0.0221 | -0.0576 | 0.0962 |
31-Jan-03 | A | -0.02095 | 0.0144 | -0.0083 | -0.0257 | 0.0158 |
28-Feb-03 | A | -0.01952 | -0.0035 | -0.0147 | -0.0188 | 0.0125 |
31-Mar-03 | A | 0.030325 | 0.01 | -0.0193 | 0.0109 | 0.0152 |
30-Apr-03 | A | 0.063457 | 0.0061 | 0.0111 | 0.0822 | -0.0942 |
30-May-03 | A | 0.070241 | 0.0484 | -0.001 | 0.0605 | -0.1076 |
30-Jun-03 | A | 0.021306 | 0.016 | 0.0057 | 0.0142 | -0.0102 |
31-Jul-03 | A | 0.031667 | 0.0534 | -0.0102 | 0.0235 | -0.0029 |
29-Aug-03 | A | 0.036795 | 0.0265 | 0.021 | 0.0234 | -0.0054 |
30-Sep-03 | A | 0.010955 | 0.0081 | 0.0001 | -0.0124 | -0.0019 |
31-Oct-03 | A | 0.06943 | 0.0281 | 0.0193 | 0.0608 | 0.0375 |
28-Nov-03 | A | 0.023784 | 0.0216 | 0.0196 | 0.0135 | 0.0163 |
31-Dec-03 | A | 0.051092 | -0.0305 | 0.0235 | 0.0429 | -0.0569 |
30-Jan-04 | A | 0.037932 | 0.0274 | 0.0196 | 0.0215 | 0.0259 |
27-Feb-04 | A | 0.032609 | -0.0143 | 0.0048 | 0.014 | -0.011 |
31-Mar-04 | A | -0.00011 | 0.0177 | 0.0025 | -0.0132 | 0.002 |
30-Apr-04 | A | -0.02934 | -0.0212 | -0.0262 | -0.0183 | -0.0536 |
31-May-04 | A | 0.013562 | -0.0019 | -0.0035 | 0.0117 | 0.0165 |
30-Jun-04 | A | 0.016311 | 0.0223 | 0.0135 | 0.0186 | 0.0207 |
30-Jul-04 | A | -0.0625 | -0.038 | 0.0409 | -0.0406 | -0.023 |
31-Aug-04 | A | 0.011417 | -0.0159 | 0.0096 | 0.0008 | -0.015 |
30-Sep-04 | A | 0.04196 | 0.0305 | -0.0024 | 0.016 | 0.0526 |
29-Oct-04 | A | 0.041746 | 0.0033 | -0.0056 | 0.0143 | -0.0149 |
30-Nov-04 | A | 0.070328 | 0.039 | 0.0181 | 0.0454 | 0.0322 |
31-Dec-04 | A | 0.050927 | 0.0016 | -0.0004 | 0.0343 | -0.0283 |
31-Jan-05 | A | -0.01264 | -0.0153 | 0.0195 | -0.0276 | 0.0312 |
data rf4 / view= rf4;
array _X1 {60} _temporary_ ;
array _X2 {60} _temporary_ ;
array _X3 {60} _temporary_ ;
array _X4 {60} _temporary_ ;
array _Y {60} _temporary_ ;
set rf3m;
by FUND_ID;
retain N 0;
N = ifn(first.FUND_ID,1,N+1);
I=mod(N-1,60)+1;
_X1{I}=MKTRF;
_X2{I}=SMB;
_X3{I}=HML;
_X4{I}=UMD;
_Y{I}=mretf;
if N>=60 then do I= 1 to 60;
MKTRF=_X1{I};
SMB=_X2{I};
HML=_X3{I};
UMD=_X4{I};
mretf=_Y{I};
output;
end;
run;
proc reg data=rf4 outest=resultsalpha noprint;
by FUND_ID caldt;
model mretf = MKTRF SMB HML UMD;
run;
Hello,
Since I don't really see a question, I can not give you an answer but only a general advice.
When you want to figure out what is going on when your program is executed, the best
is to work with a simplified example. For instance, you can use only one explanatory variable and
a rolling window of 4 observations.
data have;
input id y x;
cards;
1 12 5
1 23 8
1 10 2
1 15 7
1 31 16
1 17 9
1 22 12
;
run;
data prepare;
array _X {4} _temporary_ ;
array _Y {4} _temporary_ ;
set have;
by id;
retain N 0;
N = ifn(first.id,1,N+1);
I=mod(N-1,4)+1;
_X{I}=x;
_Y{I}=y;
if N>=4 then do I= 1 to 4;
x=_X{I};
y=_Y{I};
output;
end;
run;
The output dataset we obtain is :
Obs | id | y | x | N | I |
---|---|---|---|---|---|
1 | 1 | 12 | 5 | 4 | 1 |
2 | 1 | 23 | 8 | 4 | 2 |
3 | 1 | 10 | 2 | 4 | 3 |
4 | 1 | 15 | 7 | 4 | 4 |
5 | 1 | 31 | 16 | 5 | 1 |
6 | 1 | 23 | 8 | 5 | 2 |
7 | 1 | 10 | 2 | 5 | 3 |
8 | 1 | 15 | 7 | 5 | 4 |
9 | 1 | 31 | 16 | 6 | 1 |
10 | 1 | 17 | 9 | 6 | 2 |
11 | 1 | 10 | 2 | 6 | 3 |
12 | 1 | 15 | 7 | 6 | 4 |
13 | 1 | 31 | 16 | 7 | 1 |
14 | 1 | 17 | 9 | 7 | 2 |
15 | 1 | 22 | 12 | 7 | 3 |
16 | 1 | 15 | 7 | 7 | 4 |
If you look at obervations 5 to 8, you see that correspond to observations 2 to 5 of the orginal dataset
(though not in the same order, the 5th observation appearing first).
Obs 9 to 12 correspond to obs 3 to 6 of the original dataset and so on.
The program thus seems to do what you want, that is preparing the data for a rolling regression
with proc reg.
The fact that you received no answer yet may indicate that your question is unclear.
Please reformulate your question as I am sure that many contributors on this forum
would be able to answer.
Hello
Thank you for your reply.
Maybe you are right I should use a simplified example .
I have posted the sas code. My goal is to extract the beta of x . My question is that if I have missing values for my X and I want to run the regression only if I have 4 previous non missing values of X how can I do that ??
Since with these codes I obtain betas for date_code 04 ,05,06,07 , but I want only the beta of date_code 07 since it have full 4 previous observation.
Thank you in advance for you help
data have1;
input id $date_code y x;
cards;
1 01 12 .
1 02 23 .
1 03 10 .
1 04 15 7
1 05 31 16
1 06 17 9
1 07 22 12
;
run;
data prepare1;
array _X {4} _temporary_ ;
array _Y {4} _temporary_ ;
set have1;
by id;
retain N 0;
N = ifn(first.id,1,N+1);
I=mod(N-1,4)+1;
_X{I}=x;
_Y{I}=y;
if N>=4 then do I= 1 to 4;
x=_X{I};
y=_Y{I};
output;
end;
run;
proc reg data=prepare1 outest=result noprint;
by id date_code ;
model y = x;
run;
quit;
Hello,
data have1;
input id $date_code y x;
cards;
1 01 12 .
1 02 23 .
1 03 10 .
1 04 15 7
1 05 31 16
1 06 17 9
1 07 22 12
1 08 12 21
1 09 23 .
1 10 15 7
1 11 31 16
1 12 17 9
1 13 22 12
1 14 12 21
1 15 12 5
1 16 23 .
1 17 8 2
1 18 12 14
1 19 10 5
1 20 16 .
;
run;
/* 1/ We count the number of consecutive complete observations */
data have2 (where=(x ne .));
set have1;
by id;
retain sample size;
if first.id then sample=0;
x1=lag(x);
/* new sample if current obs is OK while previous is not */
if x ne . and x1=. then do;
sample=sample+1;
size=0;
end;
/* We increase the size for each complete observation */
if x ne . then size=size+1;
run;
/* 2/ We keep only samples with size greater or equal 4 */
proc sql noprint;
CREATE TABLE have3 AS
SELECT * FROM have2
GROUP BY ID, sample
HAVING max(size) ge 4
ORDER BY ID, sample, date_code
;
quit;
/* 3/ We prepare the data for rolling regressions */
data prepare1;
array _X {4} _temporary_ ;
array _Y {4} _temporary_ ;
set have3;
by id sample;
retain N 0;
N = ifn(first.sample,1,N+1);
I=mod(N-1,4)+1;
_X{I}=x;
_Y{I}=y;
if N>=4 then do I= 1 to 4;
x=_X{I};
y=_Y{I};
output;
end;
run;
/* 4/ We run the regressions */
proc reg data=prepare1 outest=result noprint;
by id date_code ;
model y = x;
run;
quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.