DKotch22
Calcite | Level 5

Any help is much appreciated!

 

I was required to run code for class with a provided dataset to find the 5 best linear regression models in terms of AIC. This is my output:

3a.PNG

 

I understand the top 5 are my 5 best models, but there is no intercept listed. Is it possible for a model not to have a y-intercept? I was trying to read up on this, but there seem to be some conflicting opinions.

 

Thank you in advance for the help!

 

Dan

6 REPLIES
PaigeMiller
Diamond | Level 26

Yes, it is possible to have no intercept; effectively, it means that when all x variables equal 0, the predicted y is 0.
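To make that concrete, here is a minimal Python sketch (not SAS, and not the code from the class assignment). The data values and the function names `fit_with_intercept` and `fit_through_origin` are made up for illustration. It fits the same toy data with and without an intercept and compares the predictions at x = 0.

```python
# Hypothetical toy data (not from this thread): y is roughly 5 + 2*x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [7.1, 8.9, 11.2, 12.8, 15.0]


def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def fit_through_origin(xs, ys):
    """Least squares for y = b1*x (no intercept); returns b1."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


b0, b1 = fit_with_intercept(xs, ys)
b1_origin = fit_through_origin(xs, ys)

# Prediction when x = 0 under each model.
pred_at_zero_with = b0 + b1 * 0.0
pred_at_zero_without = b1_origin * 0.0  # always 0 by construction

print("with intercept:    b0 =", round(b0, 3), " b1 =", round(b1, 3))
print("without intercept: b1 =", round(b1_origin, 3))
print("prediction at x=0:", round(pred_at_zero_with, 3), "vs", pred_at_zero_without)
```

In this toy example the no-intercept fit has to steepen its slope to compensate for being forced through the origin, which is one reason to want a justification before dropping the intercept.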

 

Of course, the next question to ask is: is it a good idea to fit models with no intercept? My answer is that I am usually skeptical when someone fits a model with no intercept without strong justification for doing so.

--
Paige Miller
DKotch22
Calcite | Level 5

Thank you!

 

So when I used AIC to find the "best 5 models", it's telling me the models that are the most "significant" are those that don't have a y-intercept? Could that mean that the y-intercept may not be very statistically significant itself and therefore the "best models" don't include it?

 

Thanks!

DKotch22
Calcite | Level 5

Also, when using the backward elimination, forward selection, or stepwise selection method, this was the "best" model:

 

3b.PNG

 

All three of these methods gave the same model, with a y-intercept (intercept, x1, x6, x9). The best model by AIC was the same except with no intercept (x1, x6, x9).

 

This is what's confusing me; I'm not sure why the intercept is left out of the models chosen by AIC.

PaigeMiller
Diamond | Level 26

Since you don't show us your code, we don't know why there is no intercept in your models; all I can say is that I think it is a very poor idea to leave out the intercept without strong justification.

@DKotch22 wrote:

when I used AIC to find the "best 5 models", it's telling me the models that are the most "significant" are those that don't have a y-intercept? Could that mean that the y-intercept may not be very statistically significant itself and therefore the "best models" don't include it?

 

No, I don't think that's what it is saying, at least using the statistical meaning of "significant". It means that using AIC, SAS chose models that did not have the intercept as a model term. AIC and "significance" in the statistical sense are not the same thing.
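As a rough illustration of that distinction, here is a small Python sketch (not SAS). It uses one common least-squares form of AIC, n*ln(SSE/n) + 2p, where p counts every estimated coefficient including the intercept. Whether that matches the exact number SAS printed in your output is an assumption, and the data and the helper name `aic_least_squares` are made up. Note that no p-value or t-test appears anywhere in the calculation: AIC only trades fit (SSE) against the number of parameters.

```python
import math

# Hypothetical toy data (not from this thread): y is roughly 5 + 2*x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [7.1, 8.9, 11.2, 12.8, 15.0]
n = len(xs)


def aic_least_squares(sse, n, p):
    """AIC in the form n*ln(SSE/n) + 2p (p = number of estimated coefficients)."""
    return n * math.log(sse / n) + 2 * p


# Model A: y = b0 + b1*x (two coefficients).
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar
sse_with = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Model B: y = b1*x, forced through the origin (one coefficient).
b1_origin = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
sse_without = sum((y - b1_origin * x) ** 2 for x, y in zip(xs, ys))

aic_with = aic_least_squares(sse_with, n, p=2)
aic_without = aic_least_squares(sse_without, n, p=1)

# Smaller AIC is preferred; this is a ranking of models, not a significance test.
print("AIC with intercept:   ", round(aic_with, 2))
print("AIC without intercept:", round(aic_without, 2))
```

For this toy data the model with the intercept gets the smaller AIC, because the improvement in fit far outweighs the penalty for one extra parameter.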

 

I am also usually opposed to any form of stepwise regression; you can find dozens of people on the internet writing about the many drawbacks of this method.

--
Paige Miller
Ksharp
Super User

I would suggest using PROC PLS to pick out the significant variables, which should avoid an over-fit model.

Check the PROC PLS example in the documentation.

PaigeMiller
Diamond | Level 26

@Ksharp wrote:

I would suggest using PROC PLS to pick out the significant variables, which should avoid an over-fit model.

Check the PROC PLS example in the documentation.


I wish I had said that ... :-P

--
Paige Miller

