03-01-2016 06:19 AM

Hello,

How can I get the AIC's of all models of stepwise regression? If I use this code, I only get the result for the final model:

```
Data Input (Drop=i j);
  Array X{*} X1-X500;
  Do j=1 To 140;
    X1=Rannor(1);
    X2=Rannor(1);
    Y=2+X1*3-X2*4+Rannor(1)-0.5;
    Do i=3 To 500;
      X{i}=Rannor(1);
    End;
    Output;
  End;
Run;

Proc Reg Data=Input OutEst=Result;
  Model Y = X1-X500 / Selection=Forward AIC BIC;
Run;
```

Thanks & kind regards

Accepted Solutions

Solution

03-01-2016
09:36 AM

03-01-2016 07:34 AM

Here's sample code for PROC GLMSELECT:

```
proc glmselect data=input;
model y = x1-x5 / selection=forward(select=sl) stats=bic details=all;
run;
```

The sub-option SELECT=SL bases variable selection on the significance level of the F statistic, as PROC REG does (the PROC GLMSELECT default would be different: SBC). The option STATS=BIC adds the BIC to the output; AIC is included by default. DETAILS=ALL requests fit statistics and many other details about the model at each step of the variable selection process.

I've reduced the number of independent variables to 5 just for demonstration. Having more explanatory effects (500) than observations (140) in the analysis dataset would not be sensible for the general linear model anyway.
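
If the goal is to have the per-step statistics in a data set rather than only in the listing, one option is to capture the selection summary with ODS OUTPUT. This is a minimal sketch, not a definitive recipe: the data set name `step_stats` is made up for illustration, and `ods trace on` is included so you can check in the log which ODS table names your SAS release actually produces.

```
/* Sketch: write the forward-selection summary to a data set. */
ods trace on;    /* the log will list the ODS tables that are created */

proc glmselect data=input;
   model y = x1-x5 / selection=forward(select=sl) stats=(aic bic);
   /* step_stats is a hypothetical output data set name */
   ods output SelectionSummary=step_stats;
run;

ods trace off;

/* One row per selection step, with the requested statistics as columns */
proc print data=step_stats;
run;
```

STATS=(AIC BIC) asks for both criteria to be reported for each step, so they should appear as columns in the captured table.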

All Replies

03-01-2016 06:49 AM

You can try using PROC HPGENSELECT.

```
proc hpgenselect data=input;
Model Y = X1-X50/dist=normal link=id;
Selection method=forward(choose=aic);
Run;
```

It shows the AIC value for each model, and it also chooses the final model based on AIC. Unfortunately, I couldn't get it to work with 500 variables (an error message due to resource problems), so I only included the first 50 variables.

| Step | Effect Entered | Number Effects In | AIC | p Value |
|---|---|---|---|---|
| 0 | Intercept | 1 | 877.2496 | . |
| 1 | x2 | 2 | 760.4677 | <.0001 |
| 2 | x1 | 3 | 413.6365 | <.0001 |
| 3 | x39 | 4 | 406.7601 | 0.0034 |
| 4 | x47 | 5 | 400.3529 | 0.0043 |
| 5 | x17 | 6 | 395.3228* | 0.0088 |
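
The same summary could probably be saved to a data set with ODS OUTPUT. The sketch below assumes the table is named `SelectionSummary` in PROC HPGENSELECT; that name and the data set name `hp_steps` are assumptions, so `ods trace on` is used to print the actual table names to the log before you rely on them.

```
/* Sketch: save the HPGENSELECT selection summary (table name assumed). */
ods trace on;    /* check the log for the real ODS table names */

proc hpgenselect data=input;
   model y = x1-x50 / dist=normal link=id;
   selection method=forward(choose=aic);
   /* SelectionSummary is an assumed table name; hp_steps is hypothetical */
   ods output SelectionSummary=hp_steps;
run;

ods trace off;

proc print data=hp_steps;
run;
```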

03-01-2016 07:11 AM

I'm sorry for the typos in my code. If I copy-paste something (I think including squiggly brackets), the editor kind of "self-destructs".

03-01-2016 07:20 AM

`ods output SelParmEst=SelParmEst;`

03-01-2016 07:22 AM

You shouldn't be using stepwise to build models; the results are wrong. See, e.g., "Stopping Stepwise".

However, if you still want this, you can use GLMSELECT with DETAILS=STEPS(FITSTATISTICS) in the MODEL statement.

03-01-2016 08:30 AM

@plf515

I partly agree and partly disagree.

Stepwise methods have a tendency to include too many variables. But if it is made clear when the results are reported that the associations were found by model selection and not by testing well-defined hypotheses, then there is no problem. Or, if the variable selection is done on a training dataset to generate hypotheses, which are then tested on another dataset, that is also a valid approach.
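
As a rough illustration of the training/test idea, the sketch below splits the simulated data in half, runs the selection on one half, and refits the selected model on the other half. The data set names `split`, `train` and `holdout`, the 50% sampling rate and the seed are all arbitrary choices for illustration; the refit relies on the `_GLSIND` macro variable in which PROC GLMSELECT stores the selected effects.

```
/* Sketch: select on a training half, then test on the held-out half. */

/* Flag a random 50% of the rows; OUTALL keeps every row and adds Selected=0/1 */
proc surveyselect data=input out=split samprate=0.5 seed=12345 outall;
run;

data train holdout;
   set split;
   if selected then output train;
   else output holdout;
run;

/* Hypothesis generation: variable selection on the training half only */
proc glmselect data=train;
   model y = x1-x5 / selection=forward(select=sl);
run;

/* Hypothesis testing: refit the selected effects on the untouched half */
proc reg data=holdout;
   model y = &_glsind;
run;
quit;
```

The p-values from the PROC REG step are the ones to report, since the holdout rows played no part in choosing the variables.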
