Golf
Pyrite | Level 9

Hello,

   I have tried to estimate the best model using the following commands.

pic1.png

Then I ranked the models by minimum AIC. The result is shown below.

 

pic3.PNG

 

Based on my understanding, the numbers highlighted in yellow are the parameter estimates of the model with the minimum AIC, where the independent variables shown as a dot "." were removed from the model.

I need to test the significance of each coefficient, so I used the following commands.

 

pic4.PNG

The coefficient estimates from this command (shown below) are different from those highlighted in yellow. Should the numbers highlighted in green be the same as the numbers highlighted in yellow?

 

pic5.PNG

Thank You

1 ACCEPTED SOLUTION
PaigeMiller
Diamond | Level 26

Are there missing values in any of these variables? That may influence how the estimates are calculated under METHOD=ADJRSQ.
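
A minimal sketch of how this could be checked. As far as I understand the PROC REG documentation, the procedure builds one crossproducts matrix for all the variables named in the MODEL statement, so a row with a missing value in any candidate variable is dropped from every subset model. Refitting only the selected variables in a separate step can then use more rows and give different estimates. The names WORK.MYDATA, Y and X1-X3 below are placeholders, since the original variable names are only visible in the screenshots.

```sas
/* Placeholder names: WORK.MYDATA, Y, X1-X3 are assumptions, not the poster's real names. */

/* Count non-missing and missing values for every variable in the full model. */
proc means data=work.mydata n nmiss;
   var y x1 x2 x3;
run;

/* Keep only rows that are complete for ALL candidate variables, */
/* so the reduced model is fit to the same rows as the selection run. */
data work.complete;
   set work.mydata;
   if nmiss(of y x1 x2 x3) = 0;
run;

/* Refit the reduced model (here assumed to keep X1 and X3) on the complete rows. */
proc reg data=work.complete;
   model y = x1 x3;
run;
quit;
```

If the estimates from the last step match the ones from the selection output, missing values were the cause of the difference.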

--
Paige Miller


3 REPLIES
sridhar3
Calcite | Level 5

Hi all, 

I am using the following syntax to generate a series of linear regression models using several independent variables. The syntax creates models using all potential combinations of the independent variables. Some of the statistics (like RMSE, SSE, AIC, RSq, AdjRsq) are stored from the parameter estimates table. However, the model number seems to be overridden - it always shows as Model1. Additionally, VIF is shown only on the model that has the best AdjRsq.

I want to be able to generate models using all possible combinations of the independent variables, and also run other diagnostic tests (like heteroscedasticity, normality of residuals etc.) on all the generated models and not on just the best model that the code picks up according to the selection. Any help is thoroughly appreciated. Here is my syntax:

 

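/* Fit every subset of the regressors and rank the subsets by adjusted R-square. */
Proc Reg Data = Temp_Master OUTEST = temp_model RSQUARE;
   /* Third regressor assumed to be IV3; the original post listed IV2 twice. */
   Model Dep1 = IV1 IV2 IV3 / Selection = AdjRsq RSQUARE AIC SSE VIF;
   /* Save the residuals for the diagnostic checks. */
   OUTPUT OUT = model_temp r = Residuals_model_temp;
RUN;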

 

A syntax that I am using for testing one of the diagnostic tests mentioned above (normality of residuals) is shown below:

 

Proc Univariate Data = model_temp; var residuals_model_temp; run;
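
One way the request above could be approached is to fit each subset in its own PROC REG step, so that every model gets its own label, VIF table, residual data set and diagnostics. The following is only a rough sketch under stated assumptions: the macro name fit_all, the output data set names (est_1, resid_1, all_est, ...) and the third regressor IV3 are made up for illustration, and the SPEC option (White's test) is just one possible heteroscedasticity check.

```sas
/* Sketch only: fit_all, est_*, resid_* and all_est are hypothetical names. */
%macro fit_all(data=Temp_Master, dep=Dep1, ivs=IV1 IV2 IV3);
   %local n mask j vars;
   %let n = %sysfunc(countw(&ivs));

   /* Each value of MASK from 1 to 2**n - 1 encodes one non-empty subset. */
   %do mask = 1 %to %eval(2**&n - 1);
      %let vars =;
      %do j = 1 %to &n;
         /* Include the j-th regressor when bit j of MASK is set. */
         %if %sysfunc(band(&mask, %eval(2**(&j - 1)))) %then
            %let vars = &vars %scan(&ivs, &j);
      %end;

      title "Model &mask: &dep = &vars";

      /* The label M&mask is written to _MODEL_ in the OUTEST= data set. */
      proc reg data=&data outest=est_&mask rsquare;
         M&mask: model &dep = &vars / vif spec aic;
         output out=resid_&mask r=resid;
      run;
      quit;

      /* Normality tests on the residuals of this particular model. */
      proc univariate data=resid_&mask normal;
         var resid;
      run;
   %end;

   title;

   /* Stack the per-model estimates into one table. */
   data all_est;
      set est_:;
   run;
%mend fit_all;

%fit_all()
```

With three regressors this runs seven models. The _MODEL_ column in all_est should then hold M1 to M7 rather than a repeated default label, and the number of fits doubles with every extra regressor, so this is only practical for a small candidate set.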

 

SteveDenham
Jade | Level 19

Before you go down this road, consider whether you really want to find a "best" model.  There are tons of posts in this forum that point out the problems with any of the selection methods and criteria.  Since model building is an art, you should consider the physical/biological/psychological processes involved and use that prior knowledge to form hypotheses about the relationship you are testing. Should you merely want the best predictor, use all the variables and their interactions and do something like a classification and regression tree approach.  Pure regression "ought" to have something to do with an assumed causality, otherwise you may as well include number of sunspots observed as a predictor.

 

SteveDenham
