Programming the statistical procedures from SAS

Mutual Fund Regression

Occasional Contributor
Posts: 5

Mutual Fund Regression

I have 8631 unique mutual funds. Because their risk exposures differ, I run a regression per fund and output the parameter estimates, but in the end I want a single estimate and a single t value for all of the funds.

For the coefficients, I take the average across the 8631 funds to serve as a single coefficient (I'm not sure this is right). For the t values, it would be wrong to simply average the t values of the 8631 funds. I need help finding a single coefficient and a single t value for these 8631 funds, even though I am running the regression by fund. Thank you. My code so far is below.

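* Suppress printed output while capturing the parameter estimates;
ods listing close;
ods noresults;
ods output parameterestimates=prince.coefew1;

* One four-factor regression per fund (data must be sorted by CRSP_FUNDNO);
proc reg data=prince.Allfund;
by CRSP_FUNDNO;
model MRETRF=mktrf smb hml umd;
run;
quit;

* Keep only the market beta of each fund (comparison is case-insensitive);
data prince.betaestew;
set prince.Coefew1;
if upcase(variable) = 'MKTRF';
varerr=stderr**2;
rename estimate=betaestew;
keep CRSP_FUNDNO Variable Estimate StdErr varerr tvalue;
run;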
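
For reference, one conventional way to attach a t value to an average of per-fund coefficients is a one-sample t test on the betas themselves: t = mean / (standard deviation / sqrt(N)), where N is the number of funds. The sketch below assumes the dataset prince.betaestew and the variable betaestew created above, and it assumes the per-fund betas are independent of one another, which may not hold for funds invested in the same market. It is an illustration of that test, not a statement that averaging is the right overall measure.

```sas
/* Sketch only: t test of the mean market beta across funds.            */
/* Assumes the per-fund betas are independent, which may not hold here. */
proc means data=prince.betaestew n mean stderr t probt;
  var betaestew;   /* per-fund mktrf coefficient from the step above */
run;
```

The T and PROBT columns test the hypothesis that the mean beta is zero; they say nothing about any individual fund.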
Super User
Posts: 23,700

Re: Mutual Fund Regression

Posted in reply to Princeelvisa

That doesn't seem correct to me. Why not remove the BY statement and run that regression model?

 

proc reg data=prince.Allfund;
model MRETRF=mktrf smb hml umd;
run;
quit;

If you want to account for the different funds, you could include the fund identifier as a variable, though it may not produce what you want.
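
One way to include the fund as a variable, sketched under assumptions: the ABSORB statement in PROC GLM fits a separate intercept for each fund while estimating one common set of slopes, which avoids building thousands of dummy columns for 8631 funds. The scratch dataset work.allfund_sorted is a hypothetical name, and the sketch assumes one row per fund per period. The slopes are still shared by all funds, so it does not capture fund-specific risk exposures.

```sas
/* Sketch only: fund fixed effects via ABSORB.                 */
/* work.allfund_sorted is a hypothetical scratch dataset name. */
proc sort data=prince.Allfund out=work.allfund_sorted;
  by CRSP_FUNDNO;        /* ABSORB requires data sorted by the absorbed variable */
run;

proc glm data=work.allfund_sorted;
  absorb CRSP_FUNDNO;    /* one intercept per fund, not printed */
  model MRETRF = mktrf smb hml umd / solution;   /* common slopes with t values */
run;
quit;
```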

Occasional Contributor
Posts: 5

Re: Mutual Fund Regression

I cannot remove the BY statement because each fund has a different risk exposure, so the correct way is to run the regression by fund number.

Super User
Posts: 23,700

Re: Mutual Fund Regression

Posted in reply to Princeelvisa

Then you'll get estimates for each fund. If each fund has its own risk exposure, why do you want an overall estimate? The average of the estimates will not be the overall risk.

Occasional Contributor
Posts: 5

Re: Mutual Fund Regression

I want the overall estimate because I'm studying the funds as a whole, but the regression needs to be run by fund before arriving at the overall result. Thanks.

Respected Advisor
Posts: 3,000

Re: Mutual Fund Regression

Posted in reply to Princeelvisa

Using a BY statement is not the way to get an overall regression. I'm not sure why you think a BY statement is needed here. Please explain in more detail.

--
Paige Miller
Occasional Contributor
Posts: 5

Re: Mutual Fund Regression

Posted in reply to PaigeMiller

Thanks so much. Using a BY statement, I intend to run the regression for each fund and store the respective estimates in a new dataset. The BY statement runs the regression for each individual fund, and since each fund has a different risk exposure, I need it. I have heard that I could use a "loop" to help run the regressions.

But my major concern is this: after keeping the estimates in a separate dataset, I find the average of the parameter estimates to serve for the whole sample. Doing the same with the t values, averaging them to obtain a single number for the whole sample, would I think be inappropriate. So how do I get a single t value for the whole sample after running the regression by fund? Thanks.

Respected Advisor
Posts: 3,000

Re: Mutual Fund Regression

Posted in reply to Princeelvisa

I would not recommend this.

The average of the slopes is not a good way to get an "overall" slope. The same applies to t-values.

There's no reason you can't do both -- run individual regressions with the BY statement to get estimates for each fund, and then run the regression without the BY statement to get the overall slope and t-values.
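
Doing both could look roughly like the sketch below. The output dataset names work.coef_by_fund and work.coef_pooled are hypothetical, and the input is assumed to be sorted by CRSP_FUNDNO. Note that the pooled fit treats every fund-period observation as independent, so its standard errors can be very small with this many rows.

```sas
/* Sketch only: per-fund and pooled regressions side by side.     */
/* work.coef_by_fund and work.coef_pooled are hypothetical names. */
ods exclude all;                          /* suppress 8631 printed tables */
ods output ParameterEstimates=work.coef_by_fund;
proc reg data=prince.Allfund;
  by CRSP_FUNDNO;                         /* data must be sorted by fund */
  model MRETRF = mktrf smb hml umd;
run;
quit;
ods exclude none;                         /* restore printed output */

ods output ParameterEstimates=work.coef_pooled;
proc reg data=prince.Allfund;
  model MRETRF = mktrf smb hml umd;       /* one fit over all fund-periods */
run;
quit;
```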

--
Paige Miller
Occasional Contributor
Posts: 5

Re: Mutual Fund Regression

Posted in reply to PaigeMiller

[Attached image: Capture.PNG] This is the result of running without the BY statement. The t values look weird to me.

Respected Advisor
Posts: 3,000

Re: Mutual Fund Regression

Posted in reply to Princeelvisa

Weird? In what way? State what is weird about it.

Lots of people have used SAS PROC REG for decades, and I am not aware of any previous claims of incorrect t-values being computed by PROC REG.

--
Paige Miller
Trusted Advisor
Posts: 1,337

Re: Mutual Fund Regression

Posted in reply to Princeelvisa

The high value of t for the mktrf factor (which I presume is overall market return minus risk-free return, probably determined as S&P 500 return minus T-bill return) when you pool all the mutual funds simply says that the "average" mutual fund is undeniably associated with mktrf.

And the parameter value (.95....) says that the class of portfolios known as mutual funds tracks the market very nearly on a 1:1 basis. What is surprising about either of these numbers? It effectively states that the risk premium for mutual funds is related to the risk premium for the overall market. Presumably your sample of mutual funds is mostly invested in offerings in the self-same market.

Super User
Posts: 23,700

Re: Mutual Fund Regression

Posted in reply to Princeelvisa

Did you standardize your variables before regression?

Also, one possibility: cluster your data with respect to the mutual funds to reduce the dimensionality, so that the 8631 factors become, say, 10 or 20 clusters, and then use the cluster as a factor in your analysis. I'm also assuming there's some time component to this data, so you may need to be working with time series regression models. Otherwise, if you have one point for each mutual fund, you definitely cannot use the BY statement.

Your model would end up as:

proc glm data=stocks;
class cluster;
model dependent = cluster mktrf smb hml umd stkmv stkmvew;
run;
quit;