BookmarkSubscribeRSS Feed
odette
Fluorite | Level 6

hello everyone

 

I am new to sas and my question is a bit basic.

I want to extract for each fund_id in each month , the  fama french alpha using 60 month previous observations using this model

 mretf = alpha +b1 MKTRF+b2 SMB + b3 HML+ b4 UMD; my sample period is 2001-2016 and the starting period of my study is 01/01/2005 .

 

I have tried the following code and I think it works well but my question is about the index  ( I=mod(N-1,60)+1). Since the table rf4 created by sas,skip the first observation at each new id. I have posted here for one fund an example of the data.

For 31 jan 2005 , the table rf4 created by sas  skip the first obs( 31 jan 2000) and begins by 29 fev 2000  until 31 jan 2005.

Your help is much appreciated. Thanks

 

CALDTFACTSET_FUND_IDmretfSMBHMLMKTRFUMD
31-Jan-00A0.021070.0501-0.0045-0.04740.0188
29-Feb-00A0.256270.2208-0.10490.02450.1838
31-Mar-00A-0.08662-0.17170.07860.052-0.068
28-Apr-00A-0.11902-0.07770.0857-0.064-0.0852
31-May-00A-0.09957-0.04940.0239-0.0442-0.0906
30-Jun-00A0.0850080.1395-0.10030.04640.165
31-Jul-00A-0.02268-0.02780.0815-0.0251-0.0012
31-Aug-00A0.085901-0.0111-0.00690.07030.057
29-Sep-00A-0.09272-0.0140.0623-0.05450.0215
31-Oct-00A-0.07226-0.03770.0555-0.0276-0.0463
30-Nov-00A-0.16127-0.02770.1129-0.1072-0.0244
29-Dec-00A0.0299030.00960.0730.01190.0673
31-Jan-01A0.0669760.0656-0.04880.0313-0.2501
28-Feb-01A-0.15926-0.00720.1291-0.10050.1248
29-Mar-01A-0.100980.00390.0649-0.07260.0838
30-Apr-01A0.0855690.005-0.04690.0794-0.0812
31-May-01A0.0029380.0260.03130.00720.0217
29-Jun-01A-0.021290.0603-0.0109-0.01940.0034
31-Jul-01A-0.04485-0.04350.0561-0.02130.055
31-Aug-01A-0.056410.0250.0256-0.06460.0554
28-Sep-01A-0.12993-0.06110.0156-0.09250.1155
31-Oct-01A0.0473470.0756-0.08020.0246-0.0838
30-Nov-01A0.032566-0.0040.020.0754-0.0857
31-Dec-01A0.0181450.04560.01080.0161-0.0003
31-Jan-02A-0.016290.01190.0335-0.01440.0376
28-Feb-02A-0.01538-0.01140.0251-0.02290.0679
28-Mar-02A0.0451980.04260.01110.0424-0.0164
30-Apr-02A-0.005750.05940.0392-0.0520.0798
31-May-02A-0.0054-0.03220.017-0.01380.0304
28-Jun-02A-0.045280.04260.0012-0.07210.0614
31-Jul-02A-0.09043-0.0537-0.0352-0.08180.0338
30-Aug-02A-0.00355-0.02430.02530.0050.0175
30-Sep-02A-0.064870.02550.0126-0.10350.0915
31-Oct-02A0.024372-0.0286-0.0540.0784-0.0546
29-Nov-02A0.0322280.0295-0.00950.0596-0.1628
31-Dec-02A-0.028500.0221-0.05760.0962
31-Jan-03A-0.020950.0144-0.0083-0.02570.0158
28-Feb-03A-0.01952-0.0035-0.0147-0.01880.0125
31-Mar-03A0.0303250.01-0.01930.01090.0152
30-Apr-03A0.0634570.00610.01110.0822-0.0942
30-May-03A0.0702410.0484-0.0010.0605-0.1076
30-Jun-03A0.0213060.0160.00570.0142-0.0102
31-Jul-03A0.0316670.0534-0.01020.0235-0.0029
29-Aug-03A0.0367950.02650.0210.0234-0.0054
30-Sep-03A0.0109550.00810.0001-0.0124-0.0019
31-Oct-03A0.069430.02810.01930.06080.0375
28-Nov-03A0.0237840.02160.01960.01350.0163
31-Dec-03A0.051092-0.03050.02350.0429-0.0569
30-Jan-04A0.0379320.02740.01960.02150.0259
27-Feb-04A0.032609-0.01430.00480.014-0.011
31-Mar-04A-0.000110.01770.0025-0.01320.002
30-Apr-04A-0.02934-0.0212-0.0262-0.0183-0.0536
31-May-04A0.013562-0.0019-0.00350.01170.0165
30-Jun-04A0.0163110.02230.01350.01860.0207
30-Jul-04A-0.0625-0.0380.0409-0.0406-0.023
31-Aug-04A0.011417-0.01590.00960.0008-0.015
30-Sep-04A0.041960.0305-0.00240.0160.0526
29-Oct-04A0.0417460.0033-0.00560.0143-0.0149
30-Nov-04A0.0703280.0390.01810.04540.0322
31-Dec-04A0.0509270.0016-0.00040.0343-0.0283
31-Jan-05A-0.01264-0.01530.0195-0.02760.0312

 

data rf4  / view= rf4;
 array _X1 {60} _temporary_ ;
 array _X2 {60} _temporary_ ;
 array _X3 {60} _temporary_ ;
 array _X4 {60} _temporary_ ;
 array _Y {60} _temporary_ ;
 set rf3m;
 by  FUND_ID;
 retain N 0;
 N = ifn(first.FUND_ID,1,N+1);
 I=mod(N-1,60)+1;
 _X1{I}=MKTRF;
 _X2{I}=SMB;
 _X3{I}=HML;
 _X4{I}=UMD;
 _Y{I}=mretf;
 if N>=60 then do I= 1 to 60;
 MKTRF=_X1{I};
 SMB=_X2{I};
 HML=_X3{I};
 UMD=_X4{I};
 mretf=_Y{I};
 output;
 end;
run;

proc reg data=rf4 outest=resultsalpha noprint;
    by FUND_ID caldt;
    model mretf = MKTRF SMB HML UMD;
run;
3 REPLIES 3
gamotte
Rhodochrosite | Level 12

Hello,

 

Since I don't really see a question, I can not give you an answer but only a general advice.

When you want to figure out what is going on when your program is executed, the best

is to work with a simplified example. For instance, you can use only one explanatory variable and

a rolling window of 4 observations.

 

 

data have;
input id y x;
cards;
1 12 5
1 23 8
1 10 2
1 15 7
1 31 16
1 17 9
1 22 12
;
run;

data prepare;
    array _X {4} _temporary_ ;
    array _Y {4} _temporary_ ;
    set have;
    by  id;
    retain N 0;
    N = ifn(first.id,1,N+1);

    I=mod(N-1,4)+1;

    _X{I}=x;
    _Y{I}=y;
    if N>=4 then do I= 1 to 4;
        x=_X{I};
        y=_Y{I};
        output;
    end;
run;

The output dataset we obtain is :

 

 

Obs id y x N I
1 1 12 5 4 1
2 1 23 8 4 2
3 1 10 2 4 3
4 1 15 7 4 4
5 1 31 16 5 1
6 1 23 8 5 2
7 1 10 2 5 3
8 1 15 7 5 4
9 1 31 16 6 1
10 1 17 9 6 2
11 1 10 2 6 3
12 1 15 7 6 4
13 1 31 16 7 1
14 1 17 9 7 2
15 1 22 12 7 3
16 1 15 7 7 4

 

If you look at obervations 5 to 8, you see that correspond to observations 2 to 5 of the orginal dataset

(though not in the same order, the 5th observation appearing first).

Obs 9 to 12 correspond to obs 3 to 6 of the original dataset and so on.

The program thus seems to do what you want, that is preparing the data for a rolling regression

with proc reg.

 

The fact that you received no answer yet may indicate that your question is unclear.

Please reformulate your question as I am sure that many contributors on this forum

would be able to answer.

odette
Fluorite | Level 6

Hello 

Thank you for your reply.

Maybe you are right I should use a simplified example .

I have posted the sas code. My goal is to  extract the beta of x . My question is that if I have missing values for my X and I want to run the regression only if I have 4 previous non missing values of X how  can I do that ??

Since with these codes I obtain  betas for date_code 04 ,05,06,07 , but I want only the beta of date_code 07 since it have full 4 previous observation.

 

Thank you in advance for you help

 

 

 

data have1;
input id $date_code y x;
cards;
1 01 12 .
1 02 23 .
1 03 10 .
1 04 15 7
1 05 31 16
1 06 17 9
1 07 22 12
;
run;

data prepare1;
    array _X {4} _temporary_ ;
    array _Y {4} _temporary_ ;
    set have1;
    by  id;
    retain N 0;
    N = ifn(first.id,1,N+1);

    I=mod(N-1,4)+1;

    _X{I}=x;
    _Y{I}=y;
    if N>=4 then do I= 1 to 4;
        x=_X{I};
        y=_Y{I};
        output;
    end;
run;

proc reg data=prepare1  outest=result noprint;
    by id date_code ;
    model y = x;
run;
quit;
gamotte
Rhodochrosite | Level 12

Hello,

 

data have1;
input id $date_code y x;
cards;
1 01 12 .
1 02 23 .
1 03 10 .
1 04 15 7
1 05 31 16
1 06 17 9
1 07 22 12
1 08 12 21
1 09 23 .
1 10 15 7
1 11 31 16
1 12 17 9
1 13 22 12
1 14 12 21
1 15 12 5
1 16 23 .
1 17 8 2
1 18 12 14
1 19 10 5
1 20 16 .
;
run;

/* 1/ We count the number of consecutive complete observations */
data have2 (where=(x ne .));
	set have1;
	by id;
	retain sample size;

	if first.id then sample=0;

	x1=lag(x);

	/* new sample if current obs is OK while previous is not */
	if x ne . and x1=. then do;
		sample=sample+1;
		size=0;
	end;

	/* We increase the size for each complete observation */
	if x ne . then size=size+1;
run;

/* 2/ We keep only samples with size greater or equal 4 */
proc sql noprint;
	CREATE TABLE have3 AS
	SELECT * FROM have2
	GROUP BY ID, sample
	HAVING max(size) ge 4
	ORDER BY ID, sample, date_code
	;
quit;
 
/* 3/ We prepare the data for rolling regressions */
data prepare1;
    array _X {4} _temporary_ ;
    array _Y {4} _temporary_ ;
    set have3;
    by  id sample;
    retain N 0;
    N = ifn(first.sample,1,N+1);

    I=mod(N-1,4)+1;

    _X{I}=x;
    _Y{I}=y;
    if N>=4 then do I= 1 to 4;
        x=_X{I};
        y=_Y{I};
        output;
    end;
run;

/* 4/ We run the regressions */
proc reg data=prepare1  outest=result noprint;
    by id date_code ;
    model y = x;
run;
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 995 views
  • 2 likes
  • 2 in conversation