Logistic Regression convergence

Posted 10-29-2017 08:21 PM

I have run a huge logistic regression with about 900 independent variables in the model. All variables in the model, including the dependent variable, are binary (1 or 0).

The log states that:

WARNING: The information matrix is singular and thus the convergence is questionable.

while also stating that:

NOTE: Convergence criterion (GCONV=1E-8) satisfied.

I am just using this model to identify potentially significant independent factors that predict the dependent outcome; I will then use those significant variables in further modeling with other covariates not included in this model.

Therefore, do I have to fix this issue by potentially removing variables (perhaps there is collinearity?), or can I rely on the model?

It would be difficult to try and pick and choose from 900 variables.

7 REPLIES

sasnewbie12 wrote:

Therefore, do I have to fix this issue by potentially removing variables (perhaps there is collinearity?), or can I rely on the model?

It would be difficult to try and pick and choose from 900 variables.

Not "perhaps". There is collinearity. As in, one (or more) of the 900 variables is a perfect linear combination of the others.
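To make "a perfect linear combination of the others" concrete, here is a minimal Python sketch (not SAS, and not based on the original poster's data) that finds exactly dependent columns in a 0/1 design matrix. The function name `dependent_columns` and the toy rows are hypothetical. The example assumes one common cause, which the thread does not confirm: three dummy variables coded from a single-choice survey question always sum to 1, so together they reproduce the intercept. Any column flagged this way makes the information matrix singular, because the design matrix is no longer of full column rank.

```python
from fractions import Fraction


def dependent_columns(rows):
    """Return indices of columns that are exact linear combinations of
    columns to their left. Column 0 is an intercept of 1s added here,
    so variable k in the input is reported as index k (1-based)."""
    # Exact rational arithmetic avoids tolerance questions for 0/1 data.
    matrix = [[Fraction(1)] + [Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(matrix), len(matrix[0])
    dependent = []
    pivot_row = 0
    for col in range(n_cols):
        # Look for a usable pivot at or below the current pivot row.
        pivot = next(
            (r for r in range(pivot_row, n_rows) if matrix[r][col] != 0), None
        )
        if pivot is None:
            # No pivot: the column lies in the span of earlier columns.
            dependent.append(col)
            continue
        matrix[pivot_row], matrix[pivot] = matrix[pivot], matrix[pivot_row]
        # Forward elimination below the pivot.
        for r in range(pivot_row + 1, n_rows):
            if matrix[r][col] != 0:
                factor = matrix[r][col] / matrix[pivot_row][col]
                matrix[r] = [
                    a - factor * b for a, b in zip(matrix[r], matrix[pivot_row])
                ]
        pivot_row += 1
    return dependent


# Hypothetical data: x1, x2, x3 are dummies for one single-choice question
# (exactly one of them is 1 per respondent); x4 is an unrelated item.
survey_rows = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(dependent_columns(survey_rows))  # prints [3]: x3 = 1 - x1 - x2
```

A screen like this only catches exact dependencies. Variables that are strongly but not perfectly correlated pass it and can still make the estimates unstable.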

I wouldn't do this. Even if you can trust the model (which you probably can't), logistic regression is a poor choice of technique when you have 900 correlated variables.

You would be better off using a technique that is much less affected by collinearity. One such method is Partial Least Squares regression, which in SAS is PROC PLS.
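As a rough illustration of why Partial Least Squares tolerates collinearity, the sketch below implements single-response PLS (PLS1) with NIPALS-style deflation in plain Python. It is not PROC PLS and does not reproduce its defaults for scaling or cross-validation. The function names and the toy data are hypothetical, and the binary response is simply treated as numeric, which is an assumption. The point is that each component comes from the covariance between the predictors and the response, so no matrix is ever inverted and a duplicated column causes no failure.

```python
import math


def pls1_fit(X, y, n_components):
    """Extract PLS1 components; returns (scores, y_loadings, y_mean)."""
    n, p = len(X), len(X[0])
    # Center every predictor column and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    scores, y_loadings = [], []
    for _ in range(n_components):
        # Weight vector: direction of X'y, scaled to unit length.
        w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        # Score vector, then loadings for X and y on that score.
        t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_load = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yc[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explains from X and y.
        Xc = [[Xc[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        yc = [yc[i] - q * t[i] for i in range(n)]
        scores.append(t)
        y_loadings.append(q)
    return scores, y_loadings, y_mean


def pls1_fitted(scores, y_loadings, y_mean):
    """Fitted responses on the training rows."""
    n = len(scores[0]) if scores else 0
    return [
        y_mean + sum(q * t[i] for t, q in zip(scores, y_loadings))
        for i in range(n)
    ]


# Hypothetical data: columns 1 and 2 are identical (perfect collinearity).
X = [[1, 1, 0], [0, 0, 1], [1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
y = [1, 0, 1, 0, 1, 0]
scores, q, y_mean = pls1_fit(X, y, n_components=2)
fitted = pls1_fitted(scores, q, y_mean)
print(max(abs(f - v) for f, v in zip(fitted, y)))  # largest training error
```

The fitted values are on a linear scale, not probabilities, so for a 0/1 outcome this is a screening device, not a replacement for a properly specified model.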

--

Paige Miller

This is also survey data. I don't think there is a PROC for PLS that handles survey data.

That doesn't change any of my comments. Logistic regression in this case is a nightmare. The collinearity will make your results meaningless.

You could modify the data to weight things as the survey requires, and then run PROC PLS.

--

Paige Miller

In my opinion, HPGENSELECT fails for the same reason as LOGISTIC: it is not meant to account for the collinearity of the 900 variables. Forward and stepwise methods are widely regarded by the statistical community as having major drawbacks.

--

Paige Miller

There are other selection methods, such as LASSO and CV, in PROC HPGENSELECT.

With regard to Lasso, there is a long thread in which many people argue that Lasso is not a good choice with a large number of correlated variables: https://stats.stackexchange.com/questions/7935/what-are-disadvantages-of-using-the-lasso-for-variabl...

I don't know enough about CV to comment.
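The concern about Lasso with correlated variables can be seen in a toy case. The sketch below is a squared-error Lasso fitted by coordinate descent in plain Python. It is not the logistic Lasso and not what PROC HPGENSELECT runs, and the function names and data are hypothetical. With two identical predictors, the penalized objective has the same value however the coefficient is split between them, so which variable ends up "selected" depends only on the order in which the algorithm visits the columns.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_objective(X, y, beta, lam):
    """(1 / 2n) * residual sum of squares + lam * L1 norm of beta."""
    n = len(X)
    rss = sum(
        (y[i] - sum(X[i][j] * beta[j] for j in range(len(beta)))) ** 2
        for i in range(n)
    )
    return rss / (2 * n) + lam * sum(abs(b) for b in beta)


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent; assumes X columns and y are centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the partial residual.
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            new_value = soft_threshold(rho, lam) / z
            delta = new_value - beta[j]
            if delta != 0.0:
                residual = [residual[i] - col[i] * delta for i in range(n)]
                beta[j] = new_value
    return beta


# Hypothetical data: a balanced 0/1 variable after centering, entered twice.
x = [0.5, -0.5, 0.5, -0.5]
X = [[v, v] for v in x]
y = list(x)
lam = 0.125

print(lasso_coordinate_descent(X, y, lam))  # prints [0.5, 0.0]
# Three different splits of the same total coefficient, one objective value.
for candidate in ([0.5, 0.0], [0.25, 0.25], [0.0, 0.5]):
    print(candidate, lasso_objective(X, y, candidate, lam))
```

The first column absorbs the whole effect and the second is dropped, although the two are interchangeable. With 900 correlated survey items, a screening list built this way could change under a mere reordering of the variables.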

--

Paige Miller
