Mike90
Quartz | Level 8

I've spent an hour and a half typing Google searches for the phrases found on these SAS charts.

 

I found two images on Google where other people asked this same exact question, and they never got an answer.

 

These charts are produced by the Model Comparison node when the target is interval.

 

Thanks for any explanatory links. 

 

 I'm looking for something like the explanation of cumulative lift charts - which everyone uses, not just SAS - which typically says something like "this graph shows that the model can select 25% of the data where it can predict the target 4 times better than chance."

 

Thanks.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
DougWielenga
SAS Employee

Mike90,

 

The thing to remember about most of the graphical output in SAS Enterprise Miner nodes is that the graphs themselves are not stored; they are created on the fly from underlying data sets when you request to view the results of a node.  You can see the underlying table on which most graphs are built by clicking on the graph and then clicking on View --> Table.  You can also select the graph of interest and click on the table icon in the top left corner of the Results browser to reveal the table.

 

It is important to note that the graphical results must be created for any possible model, so you will not see certain classical graphs associated with a particular type of analysis if that graph can't be created for all types of models.  The Model Comparison node computes statistics based on binning the observations.  You can specify the number of bins with the Number of Bins property; the default of 20 creates demi-decile (5%) groupings of the data.  Viewing the underlying table in light of these groupings should make the interpretation much clearer.

 

 

In the Score Rankings Overlay: Value table, you see the Mean Predicted (average predicted value for each bin) on the Y-axis and the Depth (percentile group), which goes from 0 to 100, on the X-axis.  The table shows the statistics computed for Depth = 5% (the first bin containing the top 5%), Depth = 10% (the second bin containing the next 5%), Depth = 15% (the third bin), ... , up to Depth = 100% (the last and lowest 5% bin).   The underlying table reveals that those values have been computed for each model, for each partition of data, for each bin.  It also reports the Mean/Min/Max Target values for each bin as well as the Mean/Min/Max Predicted target values for each bin.   The graph just plots the predicted Mean/Min/Max for each partition, overlaying the values for each model on the same graph.
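To make the binning concrete, here is a minimal Python sketch of how a table like this could be rebuilt by hand. It is not the code Enterprise Miner runs; the function name `score_rankings`, the column names, and the toy data are all made up for illustration, and it assumes the observations are ranked by predicted value from highest to lowest before being split into equal-sized bins.

```python
from statistics import mean

def score_rankings(actual, predicted, n_bins=20):
    """Rank observations by predicted value (highest first), split them into
    n_bins near-equal groups, and summarize each group.
    Hypothetical helper for illustration; not a SAS API."""
    pairs = sorted(zip(predicted, actual), key=lambda p: p[0], reverse=True)
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:  # fewer observations than bins
            continue
        preds = [p for p, _ in chunk]
        targets = [t for _, t in chunk]
        rows.append({
            "depth": 100 * (b + 1) / n_bins,  # 5, 10, ..., 100 when n_bins=20
            "mean_predicted": mean(preds),
            "min_predicted": min(preds),
            "max_predicted": max(preds),
            "mean_target": mean(targets),
            "min_target": min(targets),
            "max_target": max(targets),
        })
    return rows

# Toy data: 100 predictions, with actual values one unit above or below each.
predicted = [float(i) for i in range(1, 101)]
actual = [p + (1.0 if i % 2 == 0 else -1.0) for i, p in enumerate(predicted)]
rows = score_rankings(actual, predicted)
for row in rows[:3]:
    print(row["depth"], row["mean_predicted"], row["mean_target"])
```

Running this once per model and per partition, then plotting `mean_predicted` against `depth`, would give something like one line of the overlay chart.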

 

In the Score Distribution: Value table, you see the actual Mean/Min/Max Predicted on the Y-axis and the Model Score on the X-axis.  While I need to investigate how the Model Score column is calculated, it is a reference value between the min and max for the bin, so the plotted values will follow roughly along the line Y=X.  If you do not like that graph, you can always select the graph and then click on Edit --> Data Options... in order to specify a different variable from the data set as the X variable.
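Since the exact Model Score calculation is not confirmed here, the following Python sketch only shows one plausible reading: cut the range of predicted values into equal-width bins and use each bin's midpoint as the reference score. Both the equal-width binning and the midpoint are assumptions made for illustration, as are the function name `score_distribution` and the toy data.

```python
def score_distribution(predicted, n_bins=20):
    """Group predicted values into equal-width bins over their range and
    summarize each non-empty bin.
    ASSUMPTION: the reference score is the bin midpoint; the node's actual
    Model Score calculation is not confirmed in this thread."""
    lo, hi = min(predicted), max(predicted)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    bins = {}
    for p in predicted:
        idx = min(int((p - lo) / width), n_bins - 1)  # top value goes in last bin
        bins.setdefault(idx, []).append(p)
    rows = []
    for idx in sorted(bins):
        vals = bins[idx]
        rows.append({
            "model_score": lo + (idx + 0.5) * width,  # assumed: bin midpoint
            "mean_predicted": sum(vals) / len(vals),
            "min_predicted": min(vals),
            "max_predicted": max(vals),
        })
    return rows

# Toy data: predictions 0.0 through 10.0, summarized in 5 bins of width 2.
dist = score_distribution([float(i) for i in range(11)], n_bins=5)
for row in dist:
    print(row["model_score"], row["mean_predicted"])
```

Under that assumption the mean predicted value in each bin sits close to the bin's reference score, which is why the plotted points fall roughly along Y=X.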

 

Hope this helps!

Doug


Mike90
Quartz | Level 8

Thank you.  Now I can read those graphs.

 

Setting the X axis to predicted target and having it group the models in one chart produces a very useful graphic.

 
