01-17-2012 10:14 PM
I performed logistic regression on my data. The results show that the Hosmer and Lemeshow Goodness-of-Fit Test, the Global Null Hypothesis test, and the Analysis of Parameter Estimates are all significant. But the R-square value is only 0.176. What does that mean? Is my regression model valid?
01-17-2012 11:57 PM
The null hypothesis you are testing is that the parameter = 0. That is all statistical significance means: if the population value is 0, you would expect to get results like these less than 5% of the time. Significance is a function of both your sample size and variance. In brief, if you have a large number of people or a small population variance, your obtained value can be very close to zero and still be statistically significant.
So, significant does not mean large, it just means probably not zero.
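To make the sample-size point concrete, here is a minimal Python sketch (not SAS output). It assumes a normal (Wald z) approximation and made-up numbers: the same tiny estimate of 0.02, with a standard error that shrinks as 1/sqrt(n). The function name `two_sided_p` is hypothetical.

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided p-value for H0: parameter = 0, using a normal (Wald z) approximation."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers: identical small estimate, per-observation SD of 1,
# so the standard error is 1 / sqrt(n).
estimate = 0.02
for n in (100, 10_000, 1_000_000):
    std_error = 1.0 / math.sqrt(n)
    print(n, two_sided_p(estimate, std_error))
```

The estimate never changes, yet the p-value falls below .05 once n is large enough. Only the sample size moved.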
I'm also interested that you consider .176 a small value for explained variance. How many variables do you have in your equation? What is your dependent variable? For most things in life, if I could explain 18% of the variance through a few variables I'd be so happy I would be tap-dancing. In reality, what four variables predicted your decision to post on this forum (oh, excuse me, community) or my decision to answer it?
There is certainly nothing to say that a model cannot have a pseudo-R2 of .176 and significant goodness of fit tests.
01-18-2012 12:12 AM
In my opinion, R-square means nothing for a logistic model,
because R-square is calculated assuming a normal distribution,
whereas a logistic model uses the logistic distribution.
Also, you can't run some of the tests you would use in linear regression.
01-18-2012 01:09 AM
While it is strictly true that logistic regression does not give you an R-squared calculated the same way as in ordinary least squares regression, you can get a pseudo-R2 using proc logistic. See here for example and a good explanation.
SAS gives the likelihood-based pseudo R-square measure and its rescaled measure. Categorical Data Analysis Using The SAS System, by M. Stokes, C. Davis and G. Koch offers more details on how the generalized R-square measures that you can request are constructed and how to interpret them.
proc logistic data = hsb2;
  class prog(ref='1') / param = ref;
  model hiwrite(event='1') = female prog read math / rsq lackfit;
run;
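As I understand it, the two values the rsq option reports are the generalized R-square, 1 - (L0/L1)^(2/n), and that value divided by its maximum possible value, 1 - L0^(2/n), where L0 and L1 are the intercept-only and fitted likelihoods. A minimal Python sketch of that arithmetic follows; the function name and the log-likelihood values are hypothetical, not taken from the hsb2 example.

```python
import math

def generalized_rsq(loglik_null, loglik_model, n):
    """Return (R-square, max-rescaled R-square) from two log-likelihoods."""
    # 1 - (L0 / L1)^(2/n), written on the log scale to avoid underflow
    rsq = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    # Largest value rsq can reach for this data set: 1 - L0^(2/n)
    rsq_max = 1.0 - math.exp(2.0 * loglik_null / n)
    return rsq, rsq / rsq_max

# Hypothetical fit: n = 200, -2 Log L of 270 (intercept only) and 232 (fitted model)
rsq, rescaled = generalized_rsq(-135.0, -116.0, 200)
print(round(rsq, 3), round(rescaled, 3))
```

Because the unscaled measure cannot reach 1 for a binary outcome, the rescaled value is always the larger of the two.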
01-23-2012 06:41 AM
The H-L goodness of fit test tests something different from the overall model fit test. You want the H-L test to be non-significant, or, more precisely, you want the H-L statistic to be small. A large value of H-L indicates a problem with your model. SAS prints a table with details.
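For anyone wondering what that table is built from, here is a rough Python sketch of the usual H-L calculation: sort the observations by fitted probability, split them into groups, and compare observed with expected event counts in each group. It is a simplification. The grouping rule here is plain equal-sized chunks, which is an assumption and may not match how SAS forms its groups, and `hosmer_lemeshow` is a hypothetical name.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees of freedom) for 0/1 outcomes y and fitted probabilities p."""
    pairs = sorted(zip(p, y))  # order observations by fitted probability
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(yi for _, yi in chunk)  # events actually seen in the group
        expected = sum(pi for pi, _ in chunk)  # events the model predicts
        mean_p = expected / len(chunk)
        # Assumes 0 < mean_p < 1; a group of all-0 or all-1 fitted values would divide by zero
        stat += (observed - expected) ** 2 / (len(chunk) * mean_p * (1 - mean_p))
    return stat, groups - 2

# Hypothetical data: fitted probabilities that match the observed rates in each half
p = [0.2] * 10 + [0.8] * 10
y = [1, 1] + [0] * 8 + [1] * 8 + [0, 0]
print(hosmer_lemeshow(y, p, groups=4))
```

The statistic is compared with a chi-square distribution on groups - 2 degrees of freedom, so a small statistic (large p-value) means the observed and predicted counts agree.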
The overall model test says whether your null can be rejected. But be careful; statistical significance does NOT mean what many think it means. It is NOT the likelihood of the parameters being 0. It is the probability of getting results as extreme as, or more extreme than, the ones you got, in a sample of your size drawn from a population where the parameter is 0. This is rarely a useful question.
Whether a pseudo R2 of .18 is "large" depends on the field. In social sciences, it is pretty darn good. In physics, it would be lousy.
All of which illustrates the point that it is hard to answer a question like this sensibly without context.