09-09-2015 09:37 PM
I ran a logistic regression with a binary outcome (0 and 1) and obtained the confusion matrix. However, no observation is predicted as 1: every observation has a predicted value of 0. Looking at the predicted probabilities, the probability that Y = 1 is smaller than the probability that Y = 0 for every observation. Does anyone know the reason and how to fix this problem?
09-10-2015 09:00 AM
09-10-2015 12:17 PM
Thank you for taking the question.
You are right that it can't be fixed. The data contain 3 million observations, of which 70,000 (about 2%) have missing values that SAS ignores as usual.
My question is why this would happen even though the data definitely contain observations with Y = 1. Does it have to do with the predictors?
09-10-2015 01:32 PM
If I were to guess, it would be that the predictors have a very small effect relative to the constant term in the model. Study the following simulated data. The explanatory variable makes a relatively small contribution to the linear predictor. Even though the x variable is significant (small p-value), it just doesn't have much of an effect. The predicted probabilities are all less than 0.5.
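/* Simulate a binary response whose linear predictor is dominated by the intercept */
data a;
   call streaminit(1234);
   do i = 1 to 1000;
      x = rand("normal");
      eta = -1 + 0.15*x;                     /* small slope relative to the constant term */
      y = rand("bernoulli", logistic(eta));  /* P(y=1) = logistic(eta) */
      output;
   end;
run;

/* Fit the model for the event y=1 */
proc logistic data=a plots(only)=fitplot;
   model y(event='1') = x;
run;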
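As a rough check on why no observation is classified as 1, here is the arithmetic using the true parameters from the simulation. The fitted estimates will differ somewhat, so treat the numbers as approximate. I am also assuming the confusion matrix uses the usual rule of predicting 1 when the predicted probability is at least 0.5.

```latex
\[
p(x) = \frac{1}{1 + e^{-\eta}}, \qquad \eta = -1 + 0.15\,x
\]
\[
p(x) \ge 0.5 \iff \eta \ge 0 \iff x \ge \frac{1}{0.15} \approx 6.67
\]
\[
p(0) = \frac{1}{1 + e^{1}} \approx 0.269, \qquad
p(3) = \frac{1}{1 + e^{0.55}} \approx 0.366
\]
```

Because x is standard normal, a value above 6.67 essentially never occurs in 1000 draws, so every predicted probability stays below 0.5 and every observation is classified as 0.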