P-value, median coefficients, system of equations

06-05-2012 05:13 PM

Hello,

I am running a system of 28 equations, using OLS. I wish to determine the significance of the median coefficients by calculating the associated p-values.

What standard deviation should I use?

To give more details:

My sample is partitioned into 28 industries, so I estimate:

Y1 = a1 + b1X1 + c1W1 + e1

...

Y28 = a28 + b28X28 + c28W28 + e28

I want to assess whether median(X) is significantly different from zero, and similarly for W and a.

So my p-value should be p-value = 2*(1 - prob(|med(x) - 0| / std)).

The problem is: what is the correct standard deviation?

Thank you!

Accepted Solutions

Solution

06-06-2012 07:52 AM

Big assumption 1 to make any of this even close: The coefficients have a normal (or nearly normal) distribution.

Big assumption 2: You want the distribution of the coefficients, rather than the X's and W's.

If these are met, then the median of the coefficients is approximately normally distributed with an expected value of mu (the mean) and a variance of pi*sigma^2/(4*m), where the sample size is n = 2m + 1, so the variance is roughly pi*sigma^2/(2*n). Plugging in n = 28 for the number of observations, I get something like 0.237 * sigma hat as the standard error you want to put in for std, where sigma hat is the estimate of the standard deviation of the sample of coefficients.
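As a rough sketch of this calculation, assuming the median of n roughly normal coefficients has variance of about pi*sigma^2/(2*n) (the handout's pi*sigma^2/(4*m) with n = 2m + 1), the standard error and the two-sided p-value could be computed as below. The function names are hypothetical, and the input is simply the list of 28 estimated coefficients.

```python
import math
from statistics import NormalDist, median, stdev


def median_se_factor(n):
    """Multiplier on sigma hat for the standard error of the median,
    assuming the n values are (nearly) normally distributed."""
    return math.sqrt(math.pi / (2 * n))


def median_p_value(coefs, null_value=0.0):
    """Two-sided normal-approximation p-value for H0: median == null_value.

    coefs is the list of estimated coefficients (one per equation).
    """
    sigma_hat = stdev(coefs)                      # sample std. dev. of the coefficients
    se = median_se_factor(len(coefs)) * sigma_hat
    z = abs(median(coefs) - null_value) / se
    return 2 * (1 - NormalDist().cdf(z))


# With 28 coefficients the multiplier is roughly 0.237
print(round(median_se_factor(28), 3))
```

The p-value from this sketch is only as trustworthy as the normality assumption behind it.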

See http://web.williams.edu/go/math/sjmiller/public_html/BrownClasses/162/Handouts/MedianThm04.pdf (page 4).

But all this really makes me curious--you probably wouldn't look at the median as an estimator unless there was pretty strong evidence that the distribution was something that deviates substantially from normal. Consequently, the estimator given here is going to be off, and you don't know how much, thus the p values obtained are sketchy at best.

If you are truly concerned about the distribution, why not look at the sample distribution and select the observations at the 2.5th and 97.5th percentiles to form a nonparametric confidence interval? With only 28 observations, these would be the minimum and maximum.
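A related but different technique, sketched here as an alternative rather than as what is described above, is the distribution-free confidence interval for the median built from order statistics and the Binomial(n, 1/2) distribution. It needs no normality assumption. The function names below are hypothetical.

```python
from math import comb


def median_ci_ranks(n, alpha=0.05):
    """Return (low_rank, high_rank, coverage) for a distribution-free
    confidence interval for the median, using 1-based order-statistic ranks."""
    total = 2 ** n
    cum = 0   # running count behind P(Bin(n, 1/2) <= k - 1)
    k = 0
    # Advance k while the lower-tail probability stays at or below alpha / 2
    while k < n // 2 and (cum + comb(n, k)) / total <= alpha / 2:
        cum += comb(n, k)
        k += 1
    if k == 0:
        raise ValueError("sample too small for the requested confidence level")
    coverage = 1 - 2 * cum / total
    return k, n - k + 1, coverage


def median_ci(values, alpha=0.05):
    """Return the (lower, upper) interval endpoints taken from the sorted sample."""
    low_rank, high_rank, _ = median_ci_ranks(len(values), alpha)
    ordered = sorted(values)
    return ordered[low_rank - 1], ordered[high_rank - 1]
```

For n = 28 and alpha = 0.05 this picks the 9th and 20th smallest values, with actual coverage of about 96.4%. If that interval excludes zero, the median differs from zero at roughly the 5% level.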

Now if all this is about the distributions of the independent variables, rather than the coefficients, you could make the change in the denominator of the variance estimator to get a standard error, but it might just be easier to do a median test on the values and get a p value.
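One simple test of this kind is the one-sample sign test of whether the median equals zero. The sketch below is an assumption about which test is meant, not necessarily the exact procedure intended, and the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(values, null_median=0.0):
    """Exact two-sided sign test p-value for H0: median == null_median."""
    pos = sum(v > null_median for v in values)
    neg = sum(v < null_median for v in values)
    n = pos + neg                      # values equal to null_median are dropped
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Lower-tail probability of Binomial(n, 1/2), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

The test uses only the signs of the values, so it makes no assumption about the shape of their distribution.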

Good luck, and I hope this helps some.

Steve Denham

All Replies

06-06-2012 08:40 AM

Thank you!

I know that the above makes BIG assumptions, but I am trying to replicate someone's results... I'll see if I need to correct these assumptions later.

But your answer does help me, thanks again!

06-18-2012 06:51 AM

Why not approach the problem by pooling the 28 industries? That way, the b's and c's would be lumped into two t-statistics that would automatically tell you if they are significantly different from zero.