A question regarding statistics

03-29-2013 09:17 PM

Hi, I have estimated a regression using PROC REG with a BY variable.

Now I have regression results for every level of the BY variable.

I want to average the slope coefficients. Well, that is easy. But what about the t statistic? Can I simply average it too? And how do I interpret it?

Thanks in advance

Posted in reply to Ahmad

03-29-2013 09:53 PM

No, you can't just average the t-stat or p-value.

What are you actually trying to calculate?

Posted in reply to Reeza

03-29-2013 11:21 PM

OK, thanks for the reply. Here is what I am trying to calculate.

I have weekly liquidity data for 500 stocks over 7 years.

I am estimating an equation that looks something like this:

Liq= a+ b1x + b2z+ b3d........+bnZ +e

I have estimated this equation using PROC REG, with stockname as my BY variable, so I have estimated it 500 times (once per stock).

For reporting purposes I need to calculate the cross-sectional average of the coefficients (e.g. the average of the 500 b1 estimates). How do I report the t statistics for these averaged coefficients?
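One convention sometimes used for this kind of summary is to treat the per-stock estimates as a sample and test whether their cross-sectional mean is zero, i.e. t = mean / (standard deviation / sqrt(N)). The sketch below illustrates that idea only; it is not confirmed to be what any particular paper does. The simulated data set, the 5 stocks, and the regressors `x` and `z` are stand-ins for the real panel, and the test assumes the estimates are independent across stocks, which may understate the standard error if stocks share common shocks.

```sas
/* Hypothetical stand-in for the real panel: 5 stocks, 60 weeks each */
data myData;
  call streaminit(2013);
  length stockName $8;
  do i = 1 to 5;
    stockName = cats('S', i);
    do week = 1 to 60;
      x = rand('normal');
      z = rand('normal');
      liq = 1 + 0.5*x - 0.2*z + rand('normal');
      output;
    end;
  end;
  drop i;
run;

/* One regression per stock; capture the coefficient table via ODS. */
/* NOPRINT is avoided because it would also suppress the ODS table. */
ods exclude all;
ods output ParameterEstimates=pe;
proc reg data=myData;
  by stockName;
  model liq = x z;
run;
quit;
ods exclude none;

/* Cross-sectional mean of each coefficient, with a t test of mean = 0 */
proc means data=pe n mean stderr t probt;
  class Variable;
  var Estimate;
run;
```

The T column here is computed from the spread of the estimates across stocks. It is not an average of the per-regression t statistics.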

Posted in reply to Ahmad

03-30-2013 01:23 PM

Posted in reply to Ahmad

03-30-2013 03:35 PM

I agree with Reeza that averaging your coefficients is NOT the way to go. Instead, you should estimate single slope coefficients from all your stocks taken together. Drop the BY clause, switch from REG to GLM and use something like:

**proc glm data=myData;**

**class stockName;**

**model liq = stockName x z d / solution;**

**run;**

This will estimate a separate intercept for each stock and a single slope (with a t statistic) for each of your regressors (x, z, d, etc.).

PG

Posted in reply to Ahmad

03-30-2013 04:10 PM

If you have time series data (data over 7 years), then it's likely you should be doing some sort of time series analysis rather than PROC REG or GLM, in my opinion.

Posted in reply to Reeza

03-30-2013 05:20 PM

But this procedure has been used in a lot of recent papers in top journals, and they just average the slope coefficients. How they handle the t statistics, though, I am not sure, and I can't work it out.

Posted in reply to Ahmad

03-30-2013 06:34 PM

That explains why the market crashed

Posted in reply to Reeza

03-30-2013 09:26 PM

Haha, nice one! But the problem is not with the estimation, because this is just an intermediate estimation before the real model. We cannot report the results of 500 regressions, so this has to be done purely for reporting purposes, and I just can't figure out how they did it or how to report it. If you want, I can send you a link to the original paper. Sorry to bother you, but any help would be really appreciated.

Posted in reply to Ahmad

03-30-2013 10:05 PM

Ahmad,

It could help to see the paper you are referring to.

Maybe there is some misunderstanding on the methodology used?

Posted in reply to AncaTilea

03-30-2013 10:48 PM

Here, let me attach the paper. See Table 2 on page 266: the author has just calculated an equally weighted average of the coefficients.

Posted in reply to Ahmad

03-31-2013 01:36 PM

On pg 266, that is time series analysis, not just regression analysis, because there are seasonal and time adjustments.

You can implement a similar model in PROC REG, but you have to make sure you have the appropriate terms in the model as well.

Posted in reply to Reeza

03-31-2013 01:43 PM

Yes Reeza, I have all the appropriate terms, i.e. lags and everything. But do you know how the author has summarized the results in Table 2, especially regarding the t statistic? He has averaged the coefficients cross-sectionally, but how do I report the t statistics?

Posted in reply to Ahmad

04-01-2013 10:55 AM

I'm not sure, mostly because I don't want to read the paper thoroughly.

I would suggest contacting the authors directly. The version you attached doesn't have the author contacts, but usually when I've had articles published the author contacts are included, as well as the institutions.

Posted in reply to Reeza

04-01-2013 12:52 PM

I'm not sure how the author came up with the average t-statistic. I agree that averaging t-statistics is bad practice. If you have the opportunity to recommend a different solution, I would probably go for something like "% of regressions with a significant p-value". It's a way to say, for each independent variable, that x% of the n BY groups had a significant p-value.
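A minimal sketch of that "% significant" summary, assuming the per-stock coefficient table is captured from PROC REG through ODS. The simulated data, the regressors `x` and `z`, the data set names `pe` and `pe_flag`, and the 5% cutoff are all illustrative assumptions rather than anything taken from the paper.

```sas
/* Hypothetical stand-in for the real panel: 5 stocks, 60 weeks each */
data myData;
  call streaminit(2013);
  length stockName $8;
  do i = 1 to 5;
    stockName = cats('S', i);
    do week = 1 to 60;
      x = rand('normal');
      z = rand('normal');
      liq = 1 + 0.5*x - 0.2*z + rand('normal');
      output;
    end;
  end;
  drop i;
run;

/* One regression per stock; ODS OUTPUT saves Estimate, tValue, Probt */
ods exclude all;
ods output ParameterEstimates=pe;
proc reg data=myData;
  by stockName;
  model liq = x z;
run;
quit;
ods exclude none;

/* Flag coefficients significant at the (assumed) 5% level.          */
/* The missing() guard matters: a missing value compares as smaller  */
/* than any number in SAS, so it would otherwise be flagged.         */
data pe_flag;
  set pe;
  sig = (not missing(Probt) and Probt < 0.05);
run;

/* Mean of the 0/1 flag = share of BY groups with a significant term */
proc means data=pe_flag n mean;
  class Variable;
  var sig;
run;
```

Multiply the reported mean of `sig` by 100 to get the percentage for each independent variable.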