logistic regression: confusion matrix

09-09-2015 09:37 PM

Hi there,

I ran a logistic regression with a binary outcome (0 and 1) and obtained the confusion matrix. However, the predicted value 1 never appears: all observations have a predicted value of 0. Looking at the predicted probabilities, the probability that Y = 1 is smaller than the probability that Y = 0 for every observation. Does anyone know the reason and how to fix this problem?

Thanks.

09-10-2015 09:00 AM

I don't know what you mean by "fixing the problem." You have data and you specified a model. According to the specified model, P(Y=1) < 0.5 for all observations.

You can try changing the model (easy) or gathering more data (harder), especially for cases where Y=1.

Are there any warnings in the SAS log? If you are getting warnings about "quasi-complete separation," you might want to read the paper "Convergence Failures in Logistic Regression" by Paul Allison (2008): http://www2.sas.com/proceedings/forum2008/360-2008.pdf

09-10-2015 12:17 PM

Thank you for taking the question.

You are right, it can't be fixed. The data contains 3 million observations, about 70,000 of which (about 2%) have missing values that SAS ignores as usual.

My question is why this would happen even though the data definitely contains observations with Y=1. Does it have to do with the predictors?

Thanks again

09-10-2015 12:18 PM

I forgot to mention that there was no convergence problem. The log file did not display any warnings.

09-10-2015 01:32 PM

If I were to guess, it would be that the predictors have a very small effect relative to the constant term in the model. Study the following simulated data. The explanatory variable makes a relatively small contribution to the linear predictor. Even though the x variable is significant (small p-value), it just doesn't have much of an effect. The predicted probabilities are all less than 0.5.

```
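/* Simulate 1,000 observations in which x has only a weak effect */
data a;
   call streaminit(1234);                    /* seed for reproducibility */
   do i = 1 to 1000;
      x = rand("normal");                    /* x ~ N(0,1) */
      eta = -1 + 0.15*x;                     /* linear predictor; the intercept dominates */
      y = rand("bernoulli", logistic(eta));  /* P(Y=1) = logistic(eta) */
      output;
   end;
run;

/* Fit the model and plot the predicted probability against x */
proc logistic data=a plots(only)=effect;
   model y(event='1') = x;
run;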
```
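
To see why nothing is classified as 1 in this simulation, note that a predicted probability exceeds 0.5 only when the linear predictor is positive. With eta = -1 + 0.15*x, that requires x > 1/0.15, or roughly 6.67, which essentially never happens for a standard normal x (the fitted coefficients will differ somewhat from the simulated ones, but the conclusion should hold). The following is a minimal sketch, not part of the original reply, that checks this on the simulated data set `a` by requesting the classification table at the 0.5 cutoff and saving the predicted probabilities. The names `pred` and `phat` are arbitrary choices made for illustration.

```sas
/* Classification table at the 0.5 cutoff, plus saved predicted probabilities */
proc logistic data=a;
   model y(event='1') = x / ctable pprob=0.5;
   output out=pred p=phat;      /* phat = estimated P(Y=1); name is illustrative */
run;

/* Range of the predicted probabilities */
proc means data=pred min max;
   var phat;
run;
```

If the maximum of `phat` is below 0.5, every observation is classified as 0 at that cutoff, which matches the confusion matrix described in the question. The PPROB= option also accepts a list of cutoffs, for example `pprob=(0.2 to 0.5 by 0.1)`, if you want to see how the table changes with the threshold.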