Linear regression - prediction

03-16-2015 11:05 AM

Hello everyone,

I have a situation where three X variables have a "strange" curvilinear relationship with the Y variable. But when I use those X variables as predictors in a multiple linear regression, I get a high adjusted R-square, all X variables are statistically significant (p < 0.001), and all assumptions for linear regression are met except linearity. How is it possible that the model is still good even though the linearity assumption is not fulfilled?

Thanks.

Tomislav

Posted in reply to Tommy1201

03-16-2015 11:17 AM

Just because you can fit a linear model with a high adjusted R-square does not mean it is the best model. You should get a better adjusted R-square if you fit the curvature properly.

Curved data can still slope up or down overall, and a straight line picks up that overall trend, hence the appearance of a good fit and significant coefficients.

--

Paige Miller
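As a rough sketch of what "fitting the curvature" could look like, the step below adds squared terms and fits both models so their adjusted R-square values can be compared. The data set name `mydata` and the variable names `y`, `x1`, `x2`, `x3` are hypothetical, and a quadratic is only an assumption about the shape of the curvature; the original post does not say what form it takes.

```sas
/* Hypothetical input: data set MYDATA with response Y and predictors X1-X3. */
data mydata2;
   set mydata;
   /* Squared terms as one simple (assumed) way to model curvature. */
   x1sq = x1**2;
   x2sq = x2**2;
   x3sq = x3**2;
run;

proc reg data=mydata2;
   /* Straight-line model, as in the original question. */
   linear: model y = x1 x2 x3;
   /* Same predictors plus the squared terms. */
   quad:   model y = x1 x2 x3 x1sq x2sq x3sq;
run;
quit;
```

PROC REG prints "Adj R-Sq" for each labeled model, so the two fits can be compared side by side. If the curvature is not roughly quadratic, other transformations or a nonlinear model may be more appropriate.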


Posted in reply to Tommy1201

03-16-2015 04:11 PM

Here's an example to look at:

data junk;
   do x = 0 to 10 by .1;
      y = x + cos(x);
      output;
   end;
run;

Here x has a somewhat "strange" curvilinear relationship with y, but the R-square is 0.9385 and the parameter for x has a p-value < 0.0001.

The data bounce back and forth across the line y = x, but not very far: all of the values are within the 95% prediction interval for individual values.
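To see this for the `junk` data, one possible approach is to fit the straight line, save the residuals and the 95% prediction limits, and then plot the residuals against x. The output data set names `fit` and `check` and the variable names `yhat`, `resid`, `lower`, `upper`, and `outside` are chosen here for illustration only.

```sas
/* Fit the straight-line model to the JUNK data created above. */
proc reg data=junk;
   model y = x;
   /* LCL/UCL are the 95% prediction limits for individual values. */
   output out=fit p=yhat r=resid lcl=lower ucl=upper;
run;
quit;

/* Flag observations that fall outside the prediction interval. */
data check;
   set fit;
   outside = (y < lower or y > upper);
run;

/* The sum is the number of flagged observations. */
proc means data=check sum;
   var outside;
run;

/* Residuals vs. x: a systematic wave pattern signals unmodeled curvature. */
proc sgplot data=check;
   scatter x=x y=resid;
   refline 0 / axis=y;
run;
```

A high R-square and a significant slope only say that the line captures most of the overall trend. The residual plot is where a violation of linearity shows up, as a pattern rather than random scatter around zero.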