newboy1218
Quartz | Level 8

Hi, I am trying to understand how SAS EM conducts the fuzzy method for reject inference. According to the documentation (Reject Inference Node or Reject Inference Techniques Implemented in Credit Scoring for SAS Enterprise Miner), SAS EM creates two observations in the augmented data set for each original observation in the rejects data set. In the first observation, a target value of 0 is assigned. In the second observation, a target value of 1 is assigned. The two observations are then individually weighted by the posterior probabilities, P(non-event) and P(event), respectively. The posterior probabilities, P(non-event) and P(event), are estimated from the model that was trained on the accepts (or known good-bad) data set.
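To make the augmentation step concrete, here is a minimal sketch of how I read the description above. It is not SAS EM's actual implementation: the names `AugmentedRow` and `fuzzy_augment` are hypothetical, and the P(event) values are made-up stand-ins for the posterior probabilities that the accepts model would produce. The common reject weight is left out of this sketch.

```python
from dataclasses import dataclass


@dataclass
class AugmentedRow:
    reject_id: int   # index of the original reject observation
    target: int      # 0 = non-event, 1 = event
    weight: float    # posterior probability used as the row weight


def fuzzy_augment(p_event_by_reject):
    """Create two weighted rows per reject, one for each target value."""
    rows = []
    for reject_id, p_event in enumerate(p_event_by_reject):
        # First observation: target 0, weighted by P(non-event).
        rows.append(AugmentedRow(reject_id, 0, 1.0 - p_event))
        # Second observation: target 1, weighted by P(event).
        rows.append(AugmentedRow(reject_id, 1, p_event))
    return rows


# Hypothetical P(event) scores for two rejected applicants.
augmented = fuzzy_augment([0.2, 0.7])
for row in augmented:
    print(row)
```

In this sketch the two weights for each reject sum to 1, so each rejected applicant still contributes one observation in total before any reject weight is applied.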

 

A common frequency weight, called the reject weight, is then assigned to both observations to account for any over-sampling or under-sampling of the rejects data. The reject weight is computed as follows:

 
[Image: Capture.PNG (reject weight equation)]
 
In the above equation, Naccepts is the weighted number of observations in the accepts data set. That is, it is the number of observations in the accepts data set after frequency weights have been applied. Nrejects is the number of observations in the rejects data set; it is unweighted. 
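One way to read "weighted number of observations" is as the sum of the frequency weights over the accepts rows. The sketch below illustrates only that reading; it is my assumption about the wording, not confirmed SAS EM behavior, and the function name `weighted_count` and the frequency values are hypothetical.

```python
def weighted_count(freq_weights):
    """Weighted number of observations, read as the sum of frequency weights."""
    return sum(freq_weights)


# Hypothetical frequency weights for four accepts records.
accept_freq = [1.0, 1.0, 2.5, 0.5]

n_accepts_weighted = weighted_count(accept_freq)   # sum of the weights
n_accepts_unweighted = len(accept_freq)            # plain row count

print(n_accepts_weighted, n_accepts_unweighted)
```

Under this reading, the weighted and unweighted counts coincide whenever every frequency weight is 1 or no frequency variable is defined.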
 
My question is: what is the weighted number of observations in the accepts data set? For example, assume we have a data set on delinquency for bank credit card applications. It has 100 funded records, 200 approved-but-not-funded records, and 300 rejected records. Using these numbers:
  • Is the rejection rate 300 / 600 = 0.5?
  • Is Nrejects = 300?
  • What is Naccepts in this example?

 

Thank you.

 
