pvareschi
Quartz | Level 8
Re: Applied Analytics Using SAS Enterprise Miner
I would be very grateful if someone could clarify the concepts/definitions of Gain, Gain Chart and Cumulative Gain, since I am a bit confused, probably due to the fact the terminology does not seem to be used consistently across the industry:
 
1. Gain: this metric is reported in the Output window (under "Statistics Table") of the "Model Comparison" node (see page 6-7 of course notes); what is its formula/definition?
Is this based on the definition given at page 256 of "Enterprise Miner 15.1: Reference Help": ((% of events in decile / random % of events in decile)-1).
If so, what is its interpretation?
 
2. Gain Chart: this is the chart displayed as part of the "Score Ranking Overlay" output when selecting option "Gain"; below is a screenshot taken from the example/demonstration in "Lesson 7: Model Assessment Using SAS Enterprise Miner" (see also page 6-19 of the course notes); again, how are the values on the Y-axis calculated?
[Screenshot: Gain_chart.png]
 
3. Cumulative Gain: page 6-17 of the course notes states that "cumulative percent response" chart is more widely known as "cumulative gain" in the predictive modeling literature. It also adds that "[...] Plotting cumulative gain for all selection fractions yields a gains chart"; at page 6-20, it says "It is instructive to view the actual proportion of cases with the primary outcome (called gain or cumulative percent response) at each decile":
(a) from other sources on internet (see this as an example), it seems that "cumulative gain" is related to the "percentage of the total possible positive responses (i.e. primary outcome events) at a given depth" (in the "Score Ranking Overlay" window, that is given by "Cumulative % Capture Response"); is this just an example of inconsistency in the use of the same term?
(b) how does the "cumulative gain" differ from the Gain Chart in point (2) above?
 
Accepted Solutions
gcjfernandez
SAS Employee

1. Gain: this metric is reported in the Output window (under "Statistics Table") of the "Model Comparison" node (see page 6-7 of course notes); what is its formula/definition?
Is this based on the definition given at page 256 of "Enterprise Miner 15.1: Reference Help": ((% of events in decile / random % of events in decile)-1).
If so, what is its interpretation?

My answer:

Both the Lift and Gain statistics are computed at a depth of 10% (the top decile) by default, and Gain = Lift - 1. The formula given above is correct for Gain.
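
To make the relationship concrete, here is a minimal Python sketch (not SAS Enterprise Miner code) of how Lift and Gain at a given depth could be computed from scored data. The function name `lift_and_gain`, the toy data, and the rounding rule for the number of selected cases are assumptions made for illustration only; Enterprise Miner's exact binning may differ.

```python
def lift_and_gain(y_true, scores, depth=0.10):
    """Return (lift, gain) for the top `depth` fraction of cases ranked by score.

    Hypothetical helper for illustration; not part of any SAS product.
    """
    if not 0 < depth <= 1:
        raise ValueError("depth must be in (0, 1]")
    # Rank cases from highest to lowest predicted score.
    ranked = [y for _, y in sorted(zip(scores, y_true),
                                   key=lambda pair: pair[0], reverse=True)]
    # Number of cases selected at this depth (at least one).
    n_top = max(1, round(len(ranked) * depth))
    top_rate = sum(ranked[:n_top]) / n_top      # event rate among selected cases
    base_rate = sum(ranked) / len(ranked)       # event rate under random selection
    lift = top_rate / base_rate
    return lift, lift - 1                       # Gain = Lift - 1


# Toy data (assumed): 20 cases already in descending score order, 4 events.
y_true = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [1 - i / 20 for i in range(20)]

lift, gain = lift_and_gain(y_true, scores, depth=0.10)
print(f"lift={lift:.2f}, gain={gain:.2f}")
```

With this toy data the top 10% (2 cases) has an event rate of 100% against a 20% overall rate, so Lift is 5 and Gain is 4. Read this way, a Gain of 0 means the model does no better than random selection at that depth, and a Gain of 4 means the selected cases contain events at a rate 400% above random.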

2. Gain Chart: this is the chart displayed as part of the "Score Ranking Overlay" output when selecting option "Gain"; below is a screenshot taken from the example/demonstration in "Lesson 7: Model Assessment Using SAS Enterprise Miner" (see also page 6-19 of the course notes); again, how are the values on the Y-axis calculated?

My answer:

The values on the Y-axis are Lift - 1.

3. Cumulative Gain: page 6-17 of the course notes states that "cumulative percent response" chart is more widely known as "cumulative gain" in the predictive modeling literature. It also adds that "[...] Plotting cumulative gain for all selection fractions yields a gains chart"; at page 6-20, it says "It is instructive to view the actual proportion of cases with the primary outcome (called gain or cumulative percent response) at each decile":
(a) from other sources on internet (see this as an example), it seems that "cumulative gain" is related to the "percentage of the total possible positive responses (i.e. primary outcome events) at a given depth" (in the "Score Ranking Overlay" window, that is given by "Cumulative % Capture Response"); is this just an example of inconsistency in the use of the same term?
(b) how does the "cumulative gain" differ from the Gain Chart in point (2) above?

My answer:

Cumulative gain is equal to Cumulative % Response; therefore, SAS EM shows only Cumulative % Response.

Please note that Cumulative % Capture Response (cumulative number of events up to a given depth / total number of events) is different from Cumulative % Response (cumulative number of events up to a given depth / cumulative number of cases up to that depth).
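
The difference between the two cumulative measures is easiest to see side by side. The following Python sketch (again not SAS code) computes both at each decile, assuming the usual definitions: events selected so far divided by cases selected so far for Cumulative % Response, and events selected so far divided by all events for Cumulative % Capture Response. The function name `cumulative_table`, the dictionary keys, and the toy data are hypothetical.

```python
def cumulative_table(y_true, scores, n_bins=10):
    """Return one row per depth with both cumulative response measures.

    Hypothetical helper for illustration; assumes len(y_true) >= n_bins
    and at least one event in y_true.
    """
    # Rank cases from highest to lowest predicted score.
    ranked = [y for _, y in sorted(zip(scores, y_true),
                                   key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    total_events = sum(ranked)
    rows = []
    for b in range(1, n_bins + 1):
        n_cum = round(n * b / n_bins)        # cases selected up to this depth
        events_cum = sum(ranked[:n_cum])     # events among those cases
        rows.append({
            "depth_pct": 100 * b / n_bins,
            # Share of selected cases that are events.
            "cum_pct_response": 100 * events_cum / n_cum,
            # Share of all events that have been selected so far.
            "cum_pct_capture_response": 100 * events_cum / total_events,
        })
    return rows


# Toy data (assumed): 20 cases already in descending score order, 4 events.
y_true = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [1 - i / 20 for i in range(20)]

for row in cumulative_table(y_true, scores):
    print(row)
```

At 10% depth in this toy example, Cumulative % Response is 100 (both selected cases are events) while Cumulative % Capture Response is 50 (2 of the 4 events have been found). At 100% depth the first falls to the overall event rate of 20, and the second reaches 100.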

Please let me know if you have any further questions.

 


