Hi,
I have a training set for a binary classification problem that I balanced 50:50 between Responder = 1 and Responder = 0. Following this note (https://support.sas.com/kb/22/601.html), I included an offset variable because the true proportion in my data is about 10:90 rather than 50:50. The code is below:
ods graphics on;
proc logistic
    data=work.train_stdize
    outmodel=work.mymodel
    outest=work.mdl_betas
    namelen=32;
  class &class_var. / param=ref;
  model responder(event='1') = &class_var. &num_var. / stb lackfit ctable pprob=0.5 offset=off;
  score data=work.train_stdize fitstat out=work.trainpred outroc=work.troc;
run;
ods graphics off;
I get this warning message:
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
WARNING: The information matrix is singular and thus the convergence is questionable.
If I remove offset=off from the MODEL statement options and rerun the logistic regression, I don't get this warning. I checked all my class and numeric variables and can confirm there is no collinearity.
Any ideas on how I can fix this, or can I just ignore the warning message?
Thanks,
Lobbie
The note you referenced makes it clear that the WEIGHT option is generally superior to the offset method. I believe that you already have all the ingredients to create a weight variable, since these are included in the calculation of the offset variable 'off' (r1 and p1). Have you tried that approach? If you get the same sort of error, then the problem might be some sort of collinearity or quasi-separation in the &class_var. and &num_var. variables.
SteveDenham
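For anyone following along, a minimal sketch of the weight approach from the note might look like the code below. It assumes a population event rate p1 of 0.10 and a sample event rate r1 of 0.50 (the figures from the original post). The data set name work.train_w and the variable name w are hypothetical, and the offset is computed only for comparison.

```sas
/* Assumed rates: p1 = true event proportion, r1 = event proportion in the balanced sample */
%let p1 = 0.10;
%let r1 = 0.50;

data work.train_w;                         /* hypothetical output data set name */
  set work.train_stdize;
  /* Weight each class by (population proportion) / (sample proportion) */
  if responder = 1 then w = &p1 / &r1;
  else w = (1 - &p1) / (1 - &r1);
  /* Offset built from the same two quantities, shown for comparison only */
  off = log((&r1 * (1 - &p1)) / ((1 - &r1) * &p1));
run;

proc logistic data=work.train_w;
  class &class_var. / param=ref;
  /* No OFFSET= option here: the WEIGHT statement carries the correction */
  model responder(event='1') = &class_var. &num_var.;
  weight w;
run;
```

With these assumed rates the event weight is 0.2, the nonevent weight is 1.8, and the offset is log(9), roughly 2.197.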
Hi @SteveDenham,
I did try using weight and am still getting the warning message. While I can confirm there is no collinearity, it is possible there is some sort of quasi-separation issue in the Class variables, as you suggested. Questions are,
ods graphics on;
proc logistic
    data=work.train_stdize
    outmodel=work.mymodel
    outest=work.mdl_betas
    namelen=32;
  class &class_var. / param=ref;
  model responder(event='1') = &class_var. &num_var. / stb lackfit ctable pprob=0.5 offset=off;
  score data=work.train_stdize fitstat out=work.trainpred outroc=work.troc priorevent=&pi1;
  score data=work.validate_stdize fitstat out=work.validpred outroc=work.vroc priorevent=&pi1;
  score data=work.osd_stdize fitstat out=work.osdpred outroc=work.oroc priorevent=&pi1;
run;
ods graphics off;
Thanks,
Lobbie
Since the error message still appears using the WEIGHT statement, this has moved beyond my experience. Get with the folks at Tech Support - they should be able to decipher what is going on, and how to get the correction for oversampling done correctly.
SteveDenham
Hi @SteveDenham,
I tried again with the Weight method after making some modifications to the Class variables, and it works now. I also noticed that when I used the Offset method, the DF for one level of a categorical variable was 0 or NULL, which may be the cause of the warning message.
All good and thanks for your help,
Lobbie
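A quick way to look for the kind of sparse or empty level mentioned above is to cross-tabulate each class variable against the response. This is only a sketch, and it assumes &class_var. resolves to a space-separated list of variable names, as it appears to in the PROC LOGISTIC code.

```sas
/* Cross-tabulate every class variable against the response.         */
/* A level with a zero (or near-zero) count in one response column   */
/* is a candidate for quasi-separation; levels with very few rows    */
/* overall are candidates for collapsing into another level.         */
proc freq data=work.train_stdize;
  tables (&class_var.) * responder / norow nocol nopercent missing;
run;
```

A level that shows DF = 0 in the parameter estimates often points to a linear dependency in the design matrix, so it may also be worth cross-tabulating pairs of class variables against each other.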
@Ksharp, I tried using PEVENT= as you suggested. It did not change the values of the intercept or the coefficients, and it did not adjust the predicted probabilities based on the prior. It did give me a classification table from 0% to 100%, just as the documentation stated. However, that is not what I am after. Thanks.
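The behaviour described here is consistent with how the two options are documented: PEVENT= on the MODEL statement only feeds the CTABLE classification table, while PRIOREVENT= on the SCORE statement adjusts the posterior probabilities written to the scored data set. A minimal sketch of the second route follows, assuming a prior event rate of 0.10 and a model fitted without an offset or weight; the output data set name is hypothetical.

```sas
proc logistic data=work.train_stdize;
  class &class_var. / param=ref;
  /* Fitted on the balanced sample with no OFFSET= and no WEIGHT statement */
  model responder(event='1') = &class_var. &num_var.;
  /* PRIOREVENT= rescales the predicted probabilities to the assumed 10% prior; */
  /* the estimated intercept and coefficients are left unchanged                */
  score data=work.validate_stdize
        out=work.validpred_prior      /* hypothetical output name */
        priorevent=0.10;
run;
```

Combining PRIOREVENT= with an offset or weight correction in the same run may adjust for the oversampling twice, so it is probably safer to pick one method.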