Wald Chi Square statistics - Logistic Regression

🔒 This topic is **solved** and **locked**.

Posted 07-07-2019 02:35 AM
(18298 views)

Hi,

I have a doubt about logistic regression. The significance of the variables is tested using the Wald chi-square statistic and its corresponding p-value:

Wald Chi-Square Statistic = (Estimate / Std Error)^2

The null hypothesis is tested against a chi-square distribution. I am not clear on why we use a chi-square statistic rather than a t-statistic, as in linear regression. I know that the estimation technique in logistic regression is maximum likelihood while in linear regression it is OLS, but how does this affect the choice of distribution for testing the significance of a variable?
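For concreteness, here is a minimal Python sketch (the values of `beta` and `se` are made up) showing that testing the squared statistic against a chi-square(1) distribution is exactly the same test as comparing the unsquared ratio two-sided against a standard normal:

```python
import math

def wald_chi_square(estimate, std_error):
    """Wald chi-square statistic: (estimate / std_error)^2."""
    return (estimate / std_error) ** 2

def chi2_1_pvalue(x):
    """Survival function of a chi-square(1) variable: P(X > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))

def two_sided_normal_pvalue(z):
    """Two-sided p-value for a standard normal z: 2*(1 - Phi(|z|)) = erfc(|z|/sqrt(2))."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical coefficient estimate and standard error
beta, se = 0.9, 0.42
z = beta / se                      # the familiar "t-like" ratio
chi2 = wald_chi_square(beta, se)   # its square, the Wald chi-square

# The chi-square(1) p-value of z^2 equals the two-sided normal p-value of z
print(chi2, chi2_1_pvalue(chi2), two_sided_normal_pvalue(z))
```

The two p-values agree because sqrt(z²/2) = |z|/sqrt(2), so squaring the statistic and changing the reference distribution changes nothing about the test itself.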

Thanks,

Vishal

1 ACCEPTED SOLUTION

The description of the hypothesis test for the regression coefficients is explained in the documentation.

You can compare it to the hypothesis test for the linear regression.

Briefly, the LOGISTIC procedure tests the quadratic form directly, which is distributed as chi-square.

The REG procedure puts the quadratic form in the **numerator** and a sample variance statistic in the **denominator** and tests the ratio. The ratio of two chi-square RVs (each divided by its degrees of freedom) is an F, which explains the difference.
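The relationship between the two reference distributions can be sketched numerically: t² is an F(1, df) statistic, and as the error degrees of freedom grow, its critical value approaches the chi-square(1) critical value. A minimal Monte Carlo sketch in Python (all numbers illustrative):

```python
import math
import random
from statistics import NormalDist

random.seed(42)

def t_squared(df):
    """Square of a Student-t variate: z^2 / (chi2_df / df).
    The chi2(df) denominator is drawn as Gamma(shape=df/2, scale=2)."""
    z2 = random.gauss(0.0, 1.0) ** 2              # numerator: a chi-square(1) quadratic form
    s2 = random.gammavariate(df / 2.0, 2.0) / df  # denominator: variance estimate, near 1 for large df
    return z2 / s2

df = 400        # plenty of error degrees of freedom
n = 50_000
samples = sorted(t_squared(df) for _ in range(n))
emp_95 = samples[int(0.95 * n)]               # empirical 95th percentile of t^2 = F(1, df)

chi2_1_95 = NormalDist().inv_cdf(0.975) ** 2  # chi-square(1) 95% critical value, about 3.84
print(emp_95, chi2_1_95)
```

With df = 400 the F(1, df) critical value is already within a few hundredths of the chi-square(1) value, which is why the two tests rarely disagree in practice for large samples.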

17 REPLIES


(1) Logistic regression is a case where the outcomes are discrete (most often two, as in binary logistic regression). In linear regression, however, the outcome is continuous and can take any value. The former involves frequency-based comparisons while the latter involves mean-based comparisons.

(2) You use a chi-square statistic when the test statistic follows a chi-square distribution, and a t-statistic when it follows a t-distribution.

Probably these are the reasons why we use different statistics depending on the problem at hand.

Is this what you wanted to know?


Thanks koyelghosh,

Agree with your points: the target is continuous in linear regression and discrete in logistic regression. I would like to add a few points here:

1. Chi-square statistic = ((Beta - 0) / Std Error)^2, where Beta is the coefficient we are testing against the null hypothesis that it is 0. The part of the formula (Beta - 0) / Std Error is the same as for the t-statistic. I agree that the target variable is discrete; however, Beta comes from a population which is continuous (it can be negative or positive), which is why it is standardized. Why don't we then compare it to a t-distribution, rather than squaring it and comparing it to a chi-square distribution (the distribution of a squared standard normal)?

What you are saying in the point 'You use chi-square statistics when the observations come from a chi-square distribution, while you use t-statistics when they come from a t-distribution' is true for the target variable; however, we are testing the coefficients, which are not necessarily from a chi-square distribution.

2. Even in logistic regression the target variable is transformed using the logit function into a continuous variable (-infinity to +infinity). It is actually a generalized linear model. So why can't a t-distribution be used for checking the significance of a coefficient?
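As an aside, the logit transformation mentioned here is easy to verify numerically; a small Python sketch:

```python
import math

def logit(p):
    """Map a probability in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse: map any real number back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Probabilities near 0 and 1 map to large negative/positive values,
# and the midpoint 0.5 maps to exactly 0
print([round(logit(p), 3) for p in (0.001, 0.5, 0.999)])
```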

Thanks,

Vishal


The square of a continuous, standard normally distributed statistic is distributed as chi-square with one degree of freedom.
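This fact is easy to check by simulation; a quick Python sketch:

```python
import random
from statistics import NormalDist

random.seed(7)

n = 100_000
# Squares of standard normal draws
z_squared = [random.gauss(0.0, 1.0) ** 2 for _ in range(n)]

# chi-square(1) 95% critical value, computed from the normal quantile:
# P(Z^2 <= c) = 0.95 exactly when c = (Phi^{-1}(0.975))^2
crit = NormalDist().inv_cdf(0.975) ** 2   # about 3.84

frac_below = sum(x <= crit for x in z_squared) / n
print(crit, frac_below)   # frac_below should be close to 0.95
```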

--

Paige Miller


@vishal_prof_gmail_com, sorry I could not reply earlier; I had a busy day. Rick_SAS and PaigeMiller have already given you the answer, and I have nothing new to add. I am answering only because the question referred to me.

> 1. Chi-square statistic = ((Beta - 0) / Std Error)^2, where Beta is the coefficient we are testing against the null hypothesis that it is 0. The part of the formula (Beta - 0) / Std Error is the same as for the t-statistic. I agree that the target variable is discrete; however, Beta comes from a population which is continuous (it can be negative or positive), which is why it is standardized. Why don't we then compare it to a t-distribution, rather than squaring it and comparing it to a chi-square distribution (the distribution of a squared standard normal)?

You are right when you say that X^2 is (Beta/Std Error)^2 and that it looks very much like the t-statistic, except for the squaring. So much so that the square root of X^2 is also called a pseudo t-ratio (see here)! But why 'pseudo' when they actually look very similar, and why can't one use the t-statistic for logistic regression? You have an excellent question!

My answer would go like this.

The assumptions about the population should come before we carry out any statistical procedure; I don't think it is a good idea to reverse the flow of logic (that is, carry out a test first and make assumptions about the population later). The assumption behind the t-statistic is that the population is **approximately** normal and that the **population variance is unknown** and must be estimated from the data (which is what produces a t-distribution rather than a normal). In logistic regression, however, **the variance is a known function of the mean**. For example, in binary logistic regression the expected value is E(Y) = n*p and Var(Y) = n*p*(1-p), where n is the number of trials and p is the probability of success (for a fair coin flip it is 0.5, but it can be anything between 0 and 1). This seemingly simple difference puts different constraints on the tests that we can carry out.
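The mean-variance link described here can be checked by simulation; a small Python sketch with made-up n and p:

```python
import random
from statistics import fmean, pvariance

random.seed(0)

n, p = 50, 0.3      # hypothetical number of trials and success probability
reps = 20_000

# Each Y is a binomial count: the sum of n Bernoulli(p) trials
ys = [sum(random.random() < p for _ in range(n)) for _ in range(reps)]

print(fmean(ys), n * p)                  # sample mean vs E(Y) = n*p = 15
print(pvariance(ys), n * p * (1 - p))    # sample variance vs Var(Y) = n*p*(1-p) = 10.5
```

Once p is known (or estimated), the variance comes for free; nothing analogous to the separate sample-variance estimate of the normal-theory t-test is needed.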

> What you are saying in the point 'You use chi-square statistics when the observations come from a chi-square distribution, while you use t-statistics when they come from a t-distribution' is true for the target variable; however, we are testing the coefficients, which are not necessarily from a chi-square distribution.

and

> 2. Even in logistic regression the target variable is transformed using the logit function into a continuous variable (-infinity to +infinity). It is actually a generalized linear model.

The transformation you are talking about helps us fit and visualize the data. To begin with, we do not have continuous data: in binary logistic regression it is only 0 or 1 (dead/alive, cancer/not cancer, loan defaulted/not defaulted, etc.). The transformation helps us understand and predict, but it does not alter the underlying original data and thus its distribution. The coefficients are associated with that transformation.

I tried to keep things simple, but I am sorry if I made it more confusing instead.

Best wishes,


Hi

I have a question related to this and I would really appreciate your help.

I have a situation where I have two equations:

Y1 = A1 + B1X + ....

Y2 = A2 + B2X + ....

As you can see, the dependent variables are different (Y1 and Y2) while the independent variable in question (X) is the same. I have run the regressions and have the respective test statistics and betas for the two equations.

Here, I want to *claim* that B1 is *significantly different* from B2, and after going through this post I am using the formula

**Chi Sq = [(B1 - B2) / (S.E.1 - S.E.2)]^2**

Is this correct? If you could direct me to a paper or book that talks about this, I would really appreciate that. Thanks a lot!
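For reference, a common form of the Wald test for the difference of two coefficients estimated from independent samples divides by the standard error of the *difference*, sqrt(S.E.1² + S.E.2²), rather than by the difference of the standard errors. A minimal Python sketch with made-up numbers (b1, se1, b2, se2 are all hypothetical):

```python
import math

def wald_difference_test(b1, se1, b2, se2):
    """Wald chi-square(1) test of H0: B1 = B2 for two coefficients
    estimated from independent samples. The denominator is the
    variance of the difference, se1^2 + se2^2."""
    chi2 = (b1 - b2) ** 2 / (se1 ** 2 + se2 ** 2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square(1) survival function
    return chi2, p_value

# Hypothetical estimates from the two regressions
chi2, p = wald_difference_test(b1=0.80, se1=0.15, b2=0.35, se2=0.20)
print(chi2, p)
```

This is only a sketch under the independence assumption; if the two models are fit on the same observations, the covariance of the two estimates also enters the denominator.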




You can use Wald statistics and likelihood-ratio tests, which have asymptotically chi-square distributions, in linear regression too. But when the data are normally distributed, it is possible to use exact distributions (not relying on asymptotic results). Therefore you use t-statistics and the F-test in linear regression, as they are exact. In fact, if you use PROC GENMOD instead of PROC GLM or PROC MIXED for normally distributed data, you will get Wald and chi-square statistics.

In logistic regression it is not possible (or at best very difficult) to find test statistics with a known exact distribution, so you use chi-square and Wald statistics, because then you at least know their asymptotic distribution. And in practice, n does not need to be very large before the Wald statistic is practically indistinguishable from a χ² distribution.
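To make the asymptotics concrete, the sketch below fits a one-predictor logistic regression by Newton-Raphson on simulated data (the true coefficients -0.5 and 1.0 are made up), takes the standard error from the inverse information matrix, and forms the Wald statistic (estimate/SE)² discussed in this thread. This is not PROC LOGISTIC's implementation, just the same Wald construction in plain Python:

```python
import math
import random

random.seed(3)

# Simulate binary data from a hypothetical model: logit P(Y=1) = -0.5 + 1.0*x
n = 500
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if random.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * xi))) else 0
     for xi in x]

# Fit by maximum likelihood with Newton-Raphson
b0, b1 = 0.0, 0.0
for _ in range(25):
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
    # Gradient of the log-likelihood
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
    # Observed information (negative Hessian), a 2x2 matrix
    w = [pi * (1.0 - pi) for pi in p]
    h00 = sum(w)
    h01 = sum(wi * xi for wi, xi in zip(w, x))
    h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = h00 * h11 - h01 * h01
    # Newton step: beta += H^{-1} g
    b0 += (h11 * g0 - h01 * g1) / det
    b1 += (-h01 * g0 + h00 * g1) / det

# Standard error of b1 from the inverse information matrix
se1 = math.sqrt(h00 / det)
wald = (b1 / se1) ** 2   # Wald chi-square, asymptotically chi-square(1)
print(b0, b1, se1, wald)
```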


Thanks @JacobSimonsen @koyelghosh @Rick_SAS @StatDave @PaigeMiller

I agree that in linear regression the target variable is continuous and we use the F statistic, while in logistic regression the target variable is binary and we use the chi-square distribution. This is about testing the significance of the model, wherein we compare a null model and a model with covariates.

However, my question is about the significance of the model coefficients. Let me put it this way.

Consider a logistic regression model:

Log[p/(1-p)] = Intercept + B1X1 + B2X2 + ERROR

In this model p is the probability of an event (say, loan default), and the event takes binary values. For testing the significance of the model we use the chi-square / likelihood-ratio tests. The coefficient B1 can take any value from -infinity to +infinity, so we can say it comes from a continuous population; then why can't I use a t-distribution for testing it? To make my question clearer, consider a linear model:

Y = Intercept + B1X1 + B2X2 + ERROR

The difference between these two models is the target variable and the method of finding the coefficients: for linear regression it is OLS, while for logistic regression it is MLE. The values of B1 and B2 have the same kind of distribution in both cases, so why a t-distribution for linear and chi-square for logistic?

I think the answer lies in the distribution of the target variable; we need to frame it more objectively. In fact, how the distribution of the model coefficients is affected by the target variable or the estimation approach (OLS/MLE) needs to be answered.

Regards,

Vishal


@vishal_prof_gmail_com wrote:

> The coefficient B1 can take any value from -infinity to +infinity, so we can say it comes from a continuous population; then why can't I use a t-distribution for testing it?

Because it is not a t-distribution when the response is binary. As stated by @StatDave, if you take the square root of the statistic, then you have a t-distribution, if that's what you really want.

There are probably text books and web sites that go through the mathematics of this whole thing, those are better places to look for answers.

--

Paige Miller


"Because it is not a t-distribution when the response is binary."

In the real world, a variable never follows a t-distribution exactly; the variable is standardized and then compared to a t-distribution. I think that's not the right justification.

"As stated by @StatDave_sas, if you take the square root of the statistic, then you have a t-distribution, if that's what you really want." This is not what I am asking. My question is: why are we using the square of [(x - 0)/Std Error]? How is it derived?

"There are probably text books and web sites that go through the mathematics of this whole thing, those are better places to look for answers."

I have been trying to find the answer to this question for quite some time, and what you mention above is the answer I have mostly received. People generally answer using complicated terms like asymptotic chi-square distribution, pseudo-statistics, etc.

If you can refer me to a book or website which may help answer this, it would be very helpful.

Thanks.


To make the question more objective:

In logistic regression the estimates (coefficients) are chosen to maximize the likelihood of observing the data. How, then, do we conclude that they come from a chi-square distribution, with the chi-square statistic being ((Beta - 0)/Std Error)^2?
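For the record, the standard justification is the large-sample theory of maximum-likelihood estimators (this is textbook asymptotics, not anything specific to SAS): the MLE is approximately normal with variance given by the inverse Fisher information, so the standardized estimate is approximately standard normal, and its square is chi-square with one degree of freedom:

```latex
% Asymptotic normality of the MLE (standard large-sample theory):
\hat{\beta} \;\xrightarrow{d}\; N\!\left(\beta,\; I(\beta)^{-1}\right),
% so under H0: beta = 0, the standardized estimate is approximately
% standard normal,
\frac{\hat{\beta} - 0}{\widehat{\mathrm{SE}}(\hat{\beta})} \;\approx\; N(0,1),
% and the square of a standard normal is chi-square with 1 df:
\left(\frac{\hat{\beta}}{\widehat{\mathrm{SE}}(\hat{\beta})}\right)^{2} \;\approx\; \chi^{2}_{1}.
```

The denominator here is computed from the known information function, not from a separate sample-variance estimate, which is why no t-distribution arises.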
