Fuzzy reject inference in SAS Enterprise Miner

newboy1218 — Wed, 24 Feb 2021 23:06:08 GMT

Hi, I am trying to understand how SAS EM conducts the fuzzy method for reject inference. According to the documentation (Reject Inference Node or Reject Inference Techniques Implemented in Credit Scoring for SAS Enterprise Miner), SAS EM creates two observations in the augmented data set for each original observation in the rejects data set. In the first observation, a target value of 0 is assigned. In the second observation, a target value of 1 is assigned. The two observations are then individually weighted by the posterior probabilities, P(non-event) and P(event), respectively. The posterior probabilities, P(non-event) and P(event), are estimated from the model that was trained on the accepts (or known good-bad) data set.
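To make the duplication step described above concrete, here is a minimal Python sketch of how I understand it (this is not SAS EM code). The names `AugmentedRow` and `fuzzy_augment` are hypothetical, and the input is assumed to be the list of P(event) values that the accepts model scored on the rejects.

```python
from dataclasses import dataclass


@dataclass
class AugmentedRow:
    reject_id: int   # index of the original reject observation
    target: int      # 0 = non-event, 1 = event
    weight: float    # posterior probability used as the fuzzy weight


def fuzzy_augment(p_event_by_reject):
    """Create two weighted rows per reject, as described in the post.

    p_event_by_reject: P(event) for each reject, assumed to come from
    the model trained on the accepts (known good-bad) data.
    """
    rows = []
    for reject_id, p_event in enumerate(p_event_by_reject):
        if not 0.0 <= p_event <= 1.0:
            raise ValueError(f"P(event) must be in [0, 1], got {p_event}")
        # First copy: target 0, weighted by P(non-event) = 1 - P(event).
        rows.append(AugmentedRow(reject_id, 0, 1.0 - p_event))
        # Second copy: target 1, weighted by P(event).
        rows.append(AugmentedRow(reject_id, 1, p_event))
    return rows


# Two hypothetical rejects with P(event) of 0.2 and 0.75.
rows = fuzzy_augment([0.2, 0.75])
for row in rows:
    print(row)
```

In this sketch the two weights for each reject sum to 1, so each reject still counts as one observation before the common reject weight is applied.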

A common frequency weight, called the reject weight, is then assigned to both observations to account for any over-sampling or under-sampling of the rejects data. The reject weight is computed as follows:

In the above equation, Naccepts is the weighted number of observations in the accepts data set. That is, it is the number of observations in the accepts data set after frequency weights have been applied. Nrejects is the number of observations in the rejects data set; it is unweighted.
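The equation itself did not come through in this post. As a rough illustration only, the Python sketch below assumes the weight has the form (Naccepts × rejection rate) / (Nrejects × approval rate), with approval rate = 1 − rejection rate. That form is my assumption, chosen so that the weighted rejects and accepts end up in the ratio rejection rate : approval rate; it is not confirmed against the documentation. The function name `reject_weight` and the numbers in the usage lines are hypothetical.

```python
def reject_weight(n_accepts_weighted, n_rejects, rejection_rate):
    """Assumed form of the reject weight (not confirmed by the docs).

    n_accepts_weighted: number of accepts after frequency weights are applied
    n_rejects:          unweighted number of rejects
    rejection_rate:     population rejection rate, in [0, 1)
    """
    if n_rejects <= 0:
        raise ValueError("n_rejects must be positive")
    if not 0.0 <= rejection_rate < 1.0:
        raise ValueError("rejection_rate must be in [0, 1)")
    approval_rate = 1.0 - rejection_rate
    return (n_accepts_weighted * rejection_rate) / (n_rejects * approval_rate)


# Hypothetical numbers: 1000 weighted accepts, 250 rejects, 20% rejection rate.
# The rejects are already in the 20:80 proportion, so the weight is about 1.
w = reject_weight(1000.0, 250, 0.2)
print(w)
```

Under this assumed form, a weight above 1 would mean the rejects are under-sampled relative to the stated rejection rate, and a weight below 1 would mean they are over-sampled.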

My question is: what is the weighted number of observations in the accepts data set? For example, assume we have a data set on bank credit card application delinquency with 100 funded records, 200 approved-but-not-funded records, and 300 rejected records. Using these numbers, is it correct that:

rejection rate = 300 / 600?
Nrejects = 300?
What is our Naccepts in this example?

Thank you.
