deleted_user
Not applicable
The REG Procedure
Model: MODEL1
Dependent Variable: score

Analysis of Variance

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F

Model              7            168149          24021      11.74    <.0001
Error             96            196501     2046.88323
Corrected Total  103            364650


Root MSE          45.24249    R-Square    0.4611
Dependent Mean    81.98077    Adj R-Sq    0.4218
Coeff Var         55.18672


Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|

Intercept     1              22.88491          12.67088       1.81      0.0740
pop           1              -0.17833           0.20712      -0.86      0.3914
gdp           1           -0.00007578        0.00046149      -0.16      0.8699
times         1               2.86822           0.64863       4.42      <.0001
pg            1            0.00009151        0.00002138       4.28      <.0001
pt            1               0.00304           0.00895       0.34      0.7347
gt            1           -0.00002252        0.00002309      -0.98      0.3319
pgt           1           -0.00000226       6.098243E-7      -3.71      0.0003

pg is pop*gdp, pt is pop*times, gt is gdp*times, and pgt is pop*gdp*times.
The overall p-value is less than 0.05, but R-square is only 0.46, which does not seem good.
Do you think this is a good regression model? If not, can someone give me some suggestions?
Any suggestion is appreciated!
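As a sanity check on the output above, the summary statistics can be recomputed from the ANOVA table alone. The sketch below is a minimal DATA step, not part of the original post; the variable names are made up for illustration, and the error sum of squares is taken as corrected total minus model.

```sas
/* Recompute the fit statistics from the printed ANOVA table.       */
/* All variable names here are illustrative, not from the post.     */
data _null_;
   ss_model = 168149;                 /* Model sum of squares        */
   ss_total = 364650;                 /* Corrected Total             */
   df_model = 7;
   df_error = 96;

   ss_error = ss_total - ss_model;    /* Error sum of squares        */
   ms_model = ss_model / df_model;    /* Model mean square           */
   ms_error = ss_error / df_error;    /* Error mean square           */

   f_value  = ms_model / ms_error;    /* overall F statistic         */
   rsq      = ss_model / ss_total;    /* R-Square                    */
   adj_rsq  = 1 - (1 - rsq) * (df_model + df_error) / df_error;
   root_mse = sqrt(ms_error);         /* Root MSE                    */

   /* Write the results to the SAS log */
   put ss_error= ms_model= ms_error=;
   put f_value= rsq= adj_rsq= root_mse=;
run;
```

The values written to the log should agree with the printed F Value, R-Square, Adj R-Sq, and Root MSE up to rounding.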
8 REPLIES
Doc_Duke
Rhodochrosite | Level 12
Lindsey,

There is no absolute standard of "goodness" for a regression model; it depends on your study or discipline. In a tightly controlled laboratory experiment, an R-square of 0.9 may be unacceptable. In an observational study, an R-square of 0.20 may be perfectly reasonable. Both scenarios can have statistically significant p-values.

Doc Muhlbaier
Duke
deleted_user
Not applicable
Doc Muhlbaier:
Thank you so much.
So, you mean my R-square value is acceptable, and since the p-value is significant, the regression model is OK?
How about if I want to improve the model a little? Some of the p-values are really large; can I do some further work to drop the insignificant variables or interactions? For example, could I use
"model Y=x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3/selection=rsquare" to select the significant variables?
Thank you.
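A minimal sketch of the all-subsets request asked about above, assuming the data sit in a data set called work.scores (a hypothetical name) with the interaction columns pg, pt, gt, and pgt already computed. In PROC REG the MODEL statement option is SELECTION=; ADJRSQ and CP add adjusted R-square and Mallows' Cp for each listed subset, and BEST=3 limits the listing to the three best subsets of each size.

```sas
/* work.scores is a hypothetical data set name.                     */
/* pg, pt, gt, pgt are the product terms defined in the question.   */
proc reg data=work.scores;
   model score = pop gdp times pg pt gt pgt
         / selection=rsquare   /* rank subsets by R-square           */
           adjrsq              /* also report adjusted R-square      */
           cp                  /* also report Mallows' Cp            */
           best=3;             /* keep the 3 best subsets per size   */
run;
quit;
```

This only ranks candidate subsets; it does not by itself say which model to keep, and it can list subsets that contain an interaction without its main effects.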
Ksharp
Super User
Hi.
Just as Doc@Duke said, it depends on your intention and on the situation behind the data you are studying.
Also, the p-value of the overall model is not very meaningful by itself, because it tends to become significant
as you add more independent variables.
To get a better model, I suggest using a selection option such as 'stepwise' or 'backward' (I cannot remember exactly 😞)
to select the more important independent variables.
And the most important thing is not to forget to check your model's residuals to see whether your
regression model satisfies the assumptions of the OLS model.
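The two suggestions in this reply could look roughly like the sketch below. It assumes a hypothetical data set work.scores holding the variables from the question; the 0.05 stay level and the output variable names pred and stud are illustrative choices, not something the thread specifies. Whether automated selection is advisable at all is a separate question.

```sas
/* work.scores is a hypothetical data set name. */

/* 1. Backward elimination: start from the full model and remove   */
/*    terms until every remaining term has p < 0.05.               */
proc reg data=work.scores;
   model score = pop gdp times pg pt gt pgt
         / selection=backward slstay=0.05;
run;
quit;

/* 2. Residual check on the full model: save predicted values and  */
/*    studentized residuals, then plot one against the other.      */
proc reg data=work.scores;
   model score = pop gdp times pg pt gt pgt;
   output out=work.resid p=pred student=stud;
run;
quit;

proc sgplot data=work.resid;
   scatter x=pred y=stud;
   refline 0 / axis=y;      /* residuals should scatter around 0   */
run;
```

A funnel shape or a curved band in the scatter plot would suggest that the constant-variance or linearity assumption is in doubt.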
deleted_user
Not applicable
Thanks, Ksharp. I will try it later.
Paige
Quartz | Level 8
Since different statisticians have different ideas, and do not always agree, let me point out that I would not recommend stepwise regression, as it has known problems.

Furthermore, deleting the insignificant terms may have a minor impact on the quality of the model or a major impact; you just don't know. You may get a significant decrease in the R-squared, or an insignificant decrease. You may get a significant increase in the Adjusted R-squared, or an insignificant increase.

But in the end, I would not go about improving the model by deleting terms. I would examine the residuals, to see if there are indications of curvature or non-linearity. Should there be such indication, I would ADD curvature terms to the model to improve it.
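One way to try the "add curvature" idea is sketched below, under the assumption that the residual plots point to curvature in times. The data set names work.scores and work.scores2 and the new column times2 are hypothetical, and which variable (if any) needs a squared term depends on what the residuals actually show.

```sas
/* work.scores and work.scores2 are hypothetical data set names.   */
data work.scores2;
   set work.scores;
   times2 = times * times;    /* squared term for curvature        */
run;

proc reg data=work.scores2;
   /* Original terms plus the quadratic term in times */
   model score = pop gdp times pg pt gt pgt times2;
run;
quit;
```

The t test for times2 in the Parameter Estimates table then indicates whether the extra term is contributing, and the residual plots can be checked again on the larger model.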
Bill
Quartz | Level 8
In the manufacturing world, I too often see engineers use regression to understand engineering. It's a bit like making soup where one just keeps adding ingredients until the taste goes bad. A cook who knows his/her stuff chooses ingredients that work and then adjusts the quantities to maximize the benefit. Similarly, variables should be selected on the basis of (engineering) knowledge. Then use the regression model to determine the correct magnitude/effect for the variables chosen.
Paige
Quartz | Level 8
> In the manufacturing world, I too often see engineers
> use regression to understand engineering. It's a bit
> like making soup where one just keeps adding
> ingredients until the taste goes bad. A cook who
> knows his/her stuff chooses ingredients that work and
> then adjusts the quantities to maximize the benefit.
> Similarly, variables should be selected on the basis
> of (engineering) knowledge. Then use the regression
> model to determine the correct magnitude/effect for
> the variables chosen.

Ah, the exact opposite of the empirical approach.

And what would you do when you are in a situation where there isn't a lot of engineering understanding of the situation? What would you do when the process moves in ways you have never seen before, and your engineering knowledge can't explain why it did that, but you have a lot of data? What would the cook do in your analogy, when presented with ingredients that he has never seen before?
Ksharp
Super User
Just as you said, 'different statisticians have different ideas'. But a regression model can only describe, in the abstract, the relationship between the independent variables and the dependent variable, because no independent variable is completely unrelated to the dependent variable. We need to find the independent variables that matter most for the dependent variable, so it is necessary to omit some insignificant independent variables, and it is not wise to chase a high R-squared. As Doc@Duke said, when your model has an R-squared greater than 0.9, the data can look suspicious. So it is enough to find several important independent variables.

About 'indications of curvature or non-linearity': I think you can use a 'plot student.*x' statement to check whether the residuals are ~N(0, sigma^2).
If they are, the model fits the data well; if not, you can add x^2 or x^3 and see whether that improves the fit.
All of the above is just my opinion. :-)


Ksharp
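A rough sketch of the residual check described in this reply, assuming a hypothetical data set work.scores with the variables from the question. The PLOT statement is the legacy PROC REG plotting interface, and the PROC UNIVARIATE step for the normality check is an addition for illustration rather than something proposed in the thread.

```sas
/* work.scores is a hypothetical data set name. */
proc reg data=work.scores;
   model score = pop gdp times pg pt gt pgt;
   /* Studentized residuals against each original predictor */
   plot student.*pop student.*gdp student.*times;
   /* Save the studentized residuals for a normality check  */
   output out=work.resid student=stud;
run;
quit;

proc univariate data=work.resid normal;   /* NORMAL adds normality tests */
   var stud;
   qqplot stud / normal(mu=est sigma=est);
run;
```

A curved pattern in any of the residual plots is the kind of indication that would motivate trying an x^2 or x^3 term for that predictor.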
