Lobbie
Obsidian | Level 7

Hi,

 

I have a training set for a binary classification problem that I balanced 50:50 between Responder = 1 and Responder = 0. I also followed this note (https://support.sas.com/kb/22/601.html) on including an offset variable, because the true proportion in my data is about 10:90 rather than 50:50. The code is below:

ods graphics on;
proc logistic 
	Data = work.train_stdize 
	outmodel=work.mymodel 
	outest=work.mdl_betas 
	namelen=32;
	class &class_var. / param=ref;
	model responder(event='1') = &class_var. &num_var. / stb lackfit ctable pprob=0.5 offset=off;
	score data=work.train_stdize fitstat out=work.trainpred outroc=work.troc;
run;
ods graphics off;
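
For context, the code above does not show how the offset variable `off` was created. A minimal sketch of one way it might be computed, assuming the usual oversampling offset log((rho1*(1-pi1))/((1-rho1)*pi1)) and the proportions stated in the question (about 10% events in the population, 50% in the balanced sample). The macro variable `rho1` is a hypothetical name; `pi1` follows the name used later in this thread.

```sas
/* Assumed proportions, taken from the question text */
%let pi1  = 0.10;   /* event proportion in the population (about 10:90) */
%let rho1 = 0.50;   /* event proportion in the balanced training sample */

data work.train_stdize;
    set work.train_stdize;
    /* Constant offset: log((0.5*0.9)/(0.5*0.1)) = log(9), about 2.197 */
    off = log((&rho1 * (1 - &pi1)) / ((1 - &rho1) * &pi1));
run;
```

Because `off` is the same constant on every row, it only shifts the intercept; the slope estimates are not expected to change because of it.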

I get the warning message,

NOTE: Convergence criterion (GCONV=1E-8) satisfied.
WARNING: The information matrix is singular and thus the convergence is questionable.

If I remove offset=off from the MODEL statement options and rerun the logistic regression, I don't get this warning message. I checked all my class and numeric variables and can confirm there is no collinearity.

 

Any ideas on how I can fix this, or can I just ignore the warning message?

 

Thanks,

Lobbie

Accepted Solutions
SteveDenham
Jade | Level 19

The note you referenced makes it clear that the weighting method (the WEIGHT statement) is generally superior to the offset method. I believe you already have all the ingredients to create a weight variable, since they are used to calculate the offset variable 'off' (r1 and p1). Have you tried that approach? If you get the same warning, then the problem might be some sort of collinearity or quasi-separation in the &class_var. and &num_var. variables.
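
A rough sketch of the weighting approach, under stated assumptions: the weight for each row is the population proportion of its class divided by the sample proportion of that class, `responder` is numeric, and the values 0.10 and 0.50 come from the question. The names `rho1`, `sampwt`, and `work.train_wt` are hypothetical and used only for illustration.

```sas
/* Assumed proportions, taken from the question text */
%let pi1  = 0.10;   /* event proportion in the population */
%let rho1 = 0.50;   /* event proportion in the balanced sample */

data work.train_wt;
    set work.train_stdize;
    /* Events are down-weighted, nonevents up-weighted */
    if responder = 1 then sampwt = &pi1 / &rho1;          /* 0.2 */
    else sampwt = (1 - &pi1) / (1 - &rho1);               /* 1.8 */
run;

proc logistic data=work.train_wt;
    class &class_var. / param=ref;
    model responder(event='1') = &class_var. &num_var.;
    weight sampwt;   /* replaces offset=off in the MODEL statement */
run;
```

With a 50:50 sample these weights sum to the original number of observations, so the weighted sample size is unchanged.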

 

SteveDenham

Lobbie
Obsidian | Level 7

Hi @SteveDenham

 

I did try using weight and am still getting the warning message. While I can confirm there is no collinearity, it is possible there is some sort of quasi-separation issue in the class variables, as you suggested. My questions are:

  1. Why does the warning message go away when I remove the offset or weight option? I read somewhere that with quasi-separation the model is still OK, because only the maximum likelihood estimates cannot be computed, so the warning can be ignored. I think I am fine with this as long as the model has a good AUC and low misclassification, and these metrics differ only slightly between the training and test sets (e.g., by less than 5%).
  2. I suppose that when using the offset or weight option in PROC LOGISTIC, I will still have to use priorevent=&pi1 in the SCORE statement to get the adjusted prediction for each observation? Example below (&pi1 is the prior event value):
    ods graphics on;
    proc logistic 
    	Data = work.train_stdize 
    	outmodel=work.mymodel 
    	outest=work.mdl_betas 
    	namelen=32;
    	class &class_var. / param=ref;
    	model responder(event='1') = &class_var. &num_var. / stb lackfit ctable pprob=0.5 offset=off;
    	score data=work.train_stdize fitstat out=work.trainpred outroc=work.troc priorevent=&pi1;
    	score data=work.validate_stdize fitstat out=work.validpred outroc=work.vroc priorevent=&pi1;
    	score data=work.osd_stdize fitstat out=work.osdpred outroc=work.oroc priorevent=&pi1;
    run;
    ods graphics off;

Thanks,

Lobbie

 

SteveDenham
Jade | Level 19

Since the warning message still appears when using the WEIGHT statement, this has moved beyond my experience. Get in touch with the folks at Tech Support; they should be able to work out what is going on and how to apply the oversampling correction properly.

 

SteveDenham

Lobbie
Obsidian | Level 7

Hi @SteveDenham

 

I tried again with the Weight method after making some modifications to the class variables, and it works now. I also noticed that when I used the Offset method, the DF of one of the levels in a categorical variable was 0 or NULL, which may have been the cause of the warning message.
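
One way to look for such a level is to cross-tabulate each class variable against the response and check for levels with zero or very few observations in either outcome. This is only a sketch, assuming `&class_var.` expands to a space-separated list of variable names as in the PROC LOGISTIC code earlier in the thread.

```sas
/* One two-way table per class variable; empty or near-empty cells
   point to levels that may cause quasi-separation or a zero-DF parameter */
proc freq data=work.train_stdize;
    tables (&class_var.) * responder / norow nocol nopercent;
run;
```

Levels that turn out to be sparse could then be merged with a neighbouring level or dropped before refitting.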

 

All good and thanks for your help,

Lobbie

Ksharp
Super User
Try using PEVENT= instead of OFFSET=.

proc logistic data=sashelp.class;
model sex=weight height/pevent=0.2 ctable;
run;
Lobbie
Obsidian | Level 7

@Ksharp, I tried using PEVENT= as you suggested. It did not change the values of the intercept or the coefficients, and it did not adjust the predicted probabilities based on the prior. It did give me a classification table from 0% to 100%, just as the documentation states. However, that is not what I am after. Thanks.
