Mutual Fund Regression

02-16-2018 09:35 AM

I have 8631 unique mutual funds. Because the funds have different risk exposures, I run a regression per fund and output the parameter estimates, but in the end I want a single estimate and a single t value for all of them.

For the coefficients of these 8631 funds, I take their average to serve as a single coefficient (I'm not sure this is right). For the t values, it would be wrong to simply average the t values of the 8631 funds. I need help finding a single coefficient and a single t value for these 8631 funds, even though I am running the regression by fund. Thank you. My code so far is below.

```
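/* Suppress printed output; only the ODS output data set is needed */
ods listing close;
ods noresults;
ods output parameterestimates=prince.coefew1;

/* One four-factor regression per fund; input must be sorted by CRSP_FUNDNO */
proc reg data=prince.Allfund;
  by CRSP_FUNDNO;
  model MRETRF = mktrf smb hml umd;
run;
quit;

/* Keep only the market-beta row for each fund */
data prince.betaestew;
  set prince.Coefew1;
  if upcase(variable) = 'MKTRF';  /* case-insensitive match on the factor name */
  varerr = stderr**2;             /* squared standard error */
  rename estimate = betaestew;
  /* KEEP refers to the pre-rename name Estimate */
  keep CRSP_FUNDNO Variable Estimate StdErr varerr tvalue;
run;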
```
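
If the averaged coefficient is used, one common way to attach a t value to it is to treat the per-fund estimates as a sample and test whether their mean is zero, i.e. t = mean / (std / sqrt(N)). The following is a minimal sketch, not a recommendation: it assumes the `prince.betaestew` data set produced above and, importantly, assumes the fund-level estimates are independent of each other. Funds that hold overlapping securities violate that assumption, which can overstate the t value.

```sas
/* Re-enable printed output, which the code above switched off */
ods listing;
ods results;

/* Sketch only: cross-sectional t test of the mean market beta.
   Assumes independent per-fund estimates, which may not hold. */
proc means data=prince.betaestew n mean std stderr t probt;
  var betaestew;
run;
```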

Posted in reply to Princeelvisa

02-16-2018 11:05 AM

That doesn't seem correct to me. Why not remove the BY statement and run that regression model?

```
proc reg data=prince.Allfund;
model MRETRF=mktrf smb hml umd;
run;
```

If you want to account for the different funds, you could include the fund as a variable in the model, though it may not produce what you want.
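
As a rough illustration of including the fund in the model, the sketch below uses the `ABSORB` statement of `PROC GLM`, which is suited to a grouping variable with thousands of levels. This is an assumption about what "include that as a variable" could look like, not the poster's own code; `work.allfund_sorted` is a hypothetical data set name. Note that it gives each fund its own intercept but still estimates one common slope per factor.

```sas
/* ABSORB requires the data to be sorted by the absorbed variable */
proc sort data=prince.Allfund out=work.allfund_sorted;
  by CRSP_FUNDNO;
run;

/* Common slopes for all funds, with a separate intercept absorbed per fund */
proc glm data=work.allfund_sorted;
  absorb CRSP_FUNDNO;
  model MRETRF = mktrf smb hml umd / solution;
run;
quit;
```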

Posted in reply to Reeza

02-16-2018 11:10 AM

I cannot remove the BY statement because each fund has a different risk exposure, so the correct way is to run the regression by fund number.

Posted in reply to Princeelvisa

02-16-2018 11:13 AM

Then you'll get estimates for each fund. If each fund has its own risk exposure, why do you want an overall estimate? The average of the estimates will not be the overall risk.

Posted in reply to Reeza

02-16-2018 11:19 AM

I want the overall estimate because I'm studying the funds as a whole, but the regression needs to be run by fund before arriving at the overall figure. Thanks.

Posted in reply to Princeelvisa

02-16-2018 11:42 AM

Using a BY statement is not the way to get an overall regression. I'm not sure why you think a BY statement is needed here. Please explain in more detail.

--

Paige Miller

Posted in reply to PaigeMiller

02-17-2018 09:39 AM

Thank you so much. With the BY statement I intend to run the regression for each fund and store the respective estimates in a new dataset. Because each fund has a different risk exposure, the regression has to be run for each fund individually, hence the BY statement. I have also heard that I could use a loop to run the regressions. But my major concern is this: after storing the estimates in a separate dataset, I find the average of the parameter estimates to serve for the whole sample, but I think averaging the t values in the same way to obtain a single number would be inappropriate. How do I get a single t value for the whole sample after running the regression by fund? Thanks.

Posted in reply to Princeelvisa

02-17-2018 09:54 AM

I would not recommend this.

The average of the slopes is not a way to get a good "overall" slope. Same thing applies to t-values.

There's no reason you can't do both -- run individual regressions with the BY statement to get estimates for each fund, and then run the regression without the BY statement to get the overall slope and t-values.

--

Paige Miller
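
The "do both" suggestion above could be sketched as follows, reusing the model from the original post. The output data set names `work.by_fund` and `work.pooled` are hypothetical; the first holds one set of estimates per fund, the second a single coefficient and t value per factor.

```sas
/* Per-fund estimates: one block of rows per CRSP_FUNDNO
   (input must be sorted by CRSP_FUNDNO) */
ods output parameterestimates=work.by_fund;
proc reg data=prince.Allfund;
  by CRSP_FUNDNO;
  model MRETRF = mktrf smb hml umd;
run;
quit;

/* Pooled estimates: a single coefficient and t value per factor */
ods output parameterestimates=work.pooled;
proc reg data=prince.Allfund;
  model MRETRF = mktrf smb hml umd;
run;
quit;
```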

Posted in reply to PaigeMiller

02-17-2018 10:06 AM

This is the result of running without the BY statement. The t values look weird to me.

Posted in reply to Princeelvisa

02-17-2018 10:16 AM

Weird? In what way? State what is weird about it.

Lots of people have used SAS PROC REG for **decades**, and I am not aware of any previous claims of incorrect t-values being computed by PROC REG.

--

Paige Miller

Posted in reply to Princeelvisa

02-17-2018 03:14 PM

The high value of t for the mktrf factor (which I presume is overall market return minus risk-free return, probably determined as S&P 500 return minus T-bill return) when you pool all the mutual funds simply says that the "average" mutual fund is undeniably associated with mktrf.

And the parameter value (.95....) says that the class of portfolios known as mutual funds tracks the market very nearly on a 1:1 basis. What is surprising about either of these numbers? It effectively states that the risk premium for mutual funds is related to the risk premium for the overall market. Presumably your sample of mutual funds is mostly invested in offerings in the self-same market.
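
For reference, the pooled model and the t value under discussion can be written as below. The notation is an assumption introduced here for illustration, with $i$ indexing funds and $t$ indexing time periods:

$$
\mathit{MRETRF}_{it} = \alpha + \beta_1\,\mathit{mktrf}_t + \beta_2\,\mathit{smb}_t + \beta_3\,\mathit{hml}_t + \beta_4\,\mathit{umd}_t + \varepsilon_{it},
\qquad
t_{\hat\beta_1} = \frac{\hat\beta_1}{\operatorname{SE}(\hat\beta_1)}
$$

Under the usual OLS assumptions, $\operatorname{SE}(\hat\beta_1)$ shrinks roughly with the square root of the number of pooled observations, so with thousands of funds a slope near 1 will typically come with a very large t value.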

Posted in reply to Princeelvisa

02-17-2018 04:25 PM

Princeelvisa wrote:

This is the result of running without the BY statement. The t values look weird to me.

Did you standardize your variables before regression?

Also, one possibility: cluster your data with respect to the mutual funds, reducing the 8631 fund-level factors to, say, 10 or 20 clusters, and then use the cluster as a factor in your analysis. I'm also assuming there's a time component to this data, so you may need to work with time series regression models. Otherwise, if you have only one data point for each mutual fund, you definitely cannot use the BY statement.

Your model would end up as:

```
/* "cluster" is the fund-cluster ID; "dependent" stands for the response variable */
proc glm data=stocks;
  class cluster;
  model dependent = cluster mktrf smb hml umd stkmv stkmvew;
run;
quit;
```
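
The reply does not say how the clusters would be formed. One possible approach, offered here only as an assumption, is to cluster the funds on their estimated factor loadings. In the sketch below, `work.fund_betas` and `work.fund_clusters` are hypothetical data set names, and the choice of 10 clusters is arbitrary.

```sas
/* One row per fund, with the estimated loadings stored in columns
   named after the regressors (mktrf, smb, hml, umd) */
proc reg data=prince.Allfund outest=work.fund_betas noprint;
  by CRSP_FUNDNO;
  model MRETRF = mktrf smb hml umd;
run;
quit;

/* k-means-style clustering of funds on their loadings */
proc fastclus data=work.fund_betas maxclusters=10
              out=work.fund_clusters noprint;
  var mktrf smb hml umd;
run;
```

The `OUT=` data set carries a `CLUSTER` variable alongside `CRSP_FUNDNO`; merging it back onto the return data by `CRSP_FUNDNO` would supply the `cluster` variable used in the `PROC GLM` call above.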