Here pg is pop*gdp, pt is pop*times, gt is gdp*times, and pgt is pop*gdp*times.
The p-value is less than 0.05, but the R-squared is just .46... which is not good?
Do you think this can be a good regression model? If not, can someone give me some suggestions?
Any suggestion is appreciated!!
There is no absolute standard of "goodness" for a regression model; it depends on your study or discipline. In a tightly controlled laboratory experiment, an R-squared of .90 may be unacceptable. In an observational study, an R-squared of .20 may be perfectly reasonable. Both scenarios can have statistically significant p-values.
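For intuition, here is a toy sketch in plain Python (made-up numbers, not your data): R-squared is just the share of the variance in y that the fitted line explains, and by itself it says nothing about whether the trend is statistically significant.

```python
# Toy illustration: compute R-squared for a simple OLS line fit.
def simple_ols(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope
    b0 = my - b1 * mx                   # intercept
    yhat = [b0 + b1 * xi for xi in x]
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot          # share of variance explained
    return b0, b1, r2

x = [1, 2, 3, 4, 5, 6]
y = [1.2, 2.1, 2.6, 4.4, 4.3, 6.5]      # noisy but clearly increasing
b0, b1, r2 = simple_ols(x, y)
print(round(r2, 3))                     # below 1.0, yet the trend is real
```

A noisy but real relationship gives an R-squared well below 1, which is exactly the observational-study situation described above.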
Thank you so much.
So, you mean my R-squared value is acceptable, and since the p-value is significant, the regression model is OK?
How about if I want to build a slightly better model based on this one? Some of the p-values are really large; can I do some further work to drop the insignificant variables or interactions? For example, could I use
"model Y=x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3 / selection=rsquare;" to select the significant variables?
Just as Doc@Duke said, it depends on your intention and on the situation of the data you are studying.
Also, the overall p-value of the model is not very meaningful, because it can stay significant
simply as you add more independent variables.
To get a better model, I suggest you try the 'stepwise' or 'backward' selection options
to pick out the more important independent variables.
And the most important thing: do not forget to check your model's residuals to see whether your
regression model satisfies the assumptions of the OLS model.
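Since backward selection was mentioned: here is a minimal sketch of the idea in plain Python, with made-up data. Note this is not SAS's exact BACKWARD algorithm (which removes variables by significance level); the sketch instead drops a predictor whenever doing so improves adjusted R-squared.

```python
# Sketch of backward elimination: repeatedly drop the predictor whose
# removal most improves adjusted R-squared; stop when no removal helps.

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit(rows, y, cols):
    # OLS of y on an intercept plus the selected columns; returns (R2, adj R2).
    n, p = len(y), len(cols) + 1
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    my = sum(y) / n
    ss_res = sum((y[i] - sum(X[i][a] * beta[a] for a in range(p))) ** 2
                 for i in range(n))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return r2, 1.0 - (1.0 - r2) * (n - 1) / (n - p)

# Made-up data: y depends on columns 0 and 1; column 2 is pure noise.
rows = [[i, (i * 3) % 5, (-1) ** i] for i in range(12)]
y = [2 * r[0] + r[1] + 0.1 * (i % 3 - 1) for i, r in enumerate(rows)]

best = [0, 1, 2]
_, best_adj = fit(rows, y, best)
adj_full = best_adj
while len(best) > 1:
    drops = [(fit(rows, y, [k for k in best if k != c])[1], c) for c in best]
    adj, c = max(drops)
    if adj <= best_adj:
        break                           # no removal improves adjusted R2
    best_adj, best = adj, [k for k in best if k != c]
print(best, round(best_adj, 4))
```

The strongly predictive columns survive elimination because removing them would sharply reduce adjusted R-squared, while a pure-noise column contributes so little R-squared that the adjustment penalty argues for dropping it.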
Since different statisticians have different ideas, and do not always agree, let me point out that I would not recommend stepwise regression, as it has known problems.
Furthermore, deleting the insignificant terms may have a minor impact on the quality of the model or a major impact; you just don't know. You may get a significant decrease in the R-squared, or an insignificant decrease. You may get a significant increase in the Adjusted R-squared, or an insignificant increase.
But in the end, I would not go about improving the model by deleting terms. I would examine the residuals, to see if there are indications of curvature or non-linearity. Should there be such indication, I would ADD curvature terms to the model to improve it.
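That residual check can be illustrated with a toy sketch in plain Python (made-up data): a straight line fitted to purely quadratic data leaves residuals that are positive at both ends and negative in the middle, the classic sign that a squared term should be ADDED.

```python
# Toy illustration: residuals from a straight-line fit to curved data
# show a systematic U-shape, signaling a missing x^2 term.
def fit_line(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    b0 = my - b1 * mx
    return b0, b1

x = [float(i) for i in range(-5, 6)]
y = [xi ** 2 for xi in x]               # purely quadratic relationship
b0, b1 = fit_line(x, y)
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
# Ends positive, middle negative: curvature the linear model missed.
print([round(r, 1) for r in resid])
```

Adding an x^2 column to the design matrix would absorb this pattern completely, which is the "add curvature terms" remedy described above.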
In the manufacturing world, I too often see engineers use regression to understand engineering. It's a bit like making soup where one just keeps adding ingredients until the taste goes bad. A cook who knows his/her stuff chooses ingredients that work and then adjusts the quantities to maximize the benefit. Similarly, variables should be selected on the basis of (engineering) knowledge. Then use the regression model to determine the correct magnitude/effect for the variables chosen.
> In the manufacturing world, I too often see engineers
> use regression to understand engineering. It's a bit
> like making soup where one just keeps adding
> ingredients until the taste goes bad. A cook who
> knows his/her stuff chooses ingredients that work and
> then adjusts the quantities to maximize the benefit.
> Similarly, variables should be selected on the basis
> of (engineering) knowledge. Then use the regression
> model to determine the correct magnitude/effect for
> the variables chosen.
Ah, the exact opposite of the empirical approach.
And what would you do when you are in a situation where there isn't a lot of engineering understanding of the situation? What would you do when the process moves in ways you have never seen before, and your engineering knowledge can't explain why it did that, but you have a lot of data? What would the cook do in your analogy, when presented with ingredients that he has never seen before?
Just as you said, 'different statisticians have different ideas'. But a regression model can only roughly uncover the relationship between the independent variables and the dependent variable, because in practice no independent variable is completely unrelated to the dependent variable. We need to find the independent variables most strongly related to the dependent variable, so it is necessary to omit some insignificant independent variables, and it is not wise to chase a higher
R-squared. Just as Doc@Duke suggested, when your model's R-squared is greater than .9,
the data would be suspect. So it is enough to find the few independent variables that matter most for the dependent variable.
About 'indications of curvature or non-linearity': you can use a 'plot student.*x' statement to check whether the residuals look like ~N(0, sigma^2).
If they do, the model fits the data well; if not, you can add x^2 or x^3 terms and see whether they improve the fit.
All of the above is just my opinion. :-)