
ICE Plots (Individual Conditional Expectation) in SAS?

Contributor
Posts: 27


Hi!  Goldstein, Kapelner, Bleich, and Pitkin developed a nice tool for visualizing models estimated by any supervised learning algorithm, called ICE plots (see "Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation" https://www.researchgate.net/publication/257028373_Peeking_Inside_the_Black_Box_Visualizing_Statisti...).  Naturally there is an R package implementation (ICEbox).  Does SAS have an implementation, or any plans for producing one?  Thanks!

Super User
Posts: 9,681

Re: ICE Plots (Individual Conditional Expectation) in SAS?

The closest built-in feature is the EFFECTPLOT statement in SAS.
See:

http://blogs.sas.com/content/iml/2016/03/21/statistical-analysis-stephen-curry-shooting.html

http://blogs.sas.com/content/iml/2016/06/22/sas-effectplot-statement.html


SAS Employee
Posts: 106

Re: ICE Plots (Individual Conditional Expectation) in SAS?


Sorry for the late response but I just saw this and happen to be reading the ICE article. 

 

This link shows one way to do partial dependence (PD) plots in SAS. The example uses regression, but the technique is model-agnostic.

 

https://qizeresearch.wordpress.com/2013/12/12/partial-dependence-plot/

 

For ICE plots you simply skip the aggregation of the predicted values (the yHats): instead of averaging, you plot each observation's prediction curve over the range of values of the variable of interest. I like to overlay the PD function on the individual curves, as it gives you an idea of individual differences around the overall tendency, and it can help to discover interactions and interesting subgroups. There are examples of this in the ICE paper.
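To make the relationship between ICE and PD concrete, here is a minimal, language-agnostic sketch written in Python rather than SAS. It is not the method from the linked post. The names `ice_curves`, `partial_dependence`, and `toy_model` are hypothetical and exist only for illustration; `predict` stands for any fitted model's scoring function.

```python
from statistics import mean

def ice_curves(predict, rows, feature, grid):
    """Return one prediction curve per row, varying `feature` over `grid`."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # overwrite only the variable of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves

def partial_dependence(curves):
    """Average the ICE curves pointwise; this is the aggregation ICE skips."""
    return [mean(points) for points in zip(*curves)]

# Hypothetical scoring function with an x1*x2 interaction (an assumption,
# standing in for whatever model you have actually fitted).
def toy_model(row):
    return row["x1"] * row["x2"]

rows = [{"x1": 0.0, "x2": -1.0}, {"x1": 0.0, "x2": 1.0}]
grid = [0.0, 1.0, 2.0]

curves = ice_curves(toy_model, rows, "x1", grid)
pd_curve = partial_dependence(curves)
```

In this toy case the two ICE curves slope in opposite directions while their average is flat, which is the kind of interaction that the PD function alone would hide.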

 

Just note that scalability may be an issue with big data (lots of rows or high-cardinality inputs), and there may be visualization challenges (clutter), because ICE gives you a separate curve for each observation in your dataset. Thus, you might consider sampling the "other" rows or binning the values of the variable of interest (especially for high-cardinality interval inputs). Other tricks can be applied as well to select the interesting curves rather than plotting them all.
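One possible way to apply the sampling and binning ideas, again sketched in Python rather than SAS: draw a random subset of rows to plot, and replace the full set of observed values with a small grid of evenly spaced order statistics. The helper names `sample_rows` and `quantile_grid` are hypothetical, and picking order statistics is just one assumed binning scheme among many.

```python
import random

def sample_rows(rows, n, seed=0):
    """Return at most n rows chosen at random (seeded for reproducibility)."""
    if n >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, n)

def quantile_grid(values, n_points):
    """Reduce a high-cardinality input to about n_points grid values.

    Picks evenly spaced order statistics, so the grid follows the
    observed distribution of the variable.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    ordered = sorted(values)
    if n_points >= len(ordered):
        return sorted(set(ordered))
    last = len(ordered) - 1
    indices = [round(i * last / (n_points - 1)) for i in range(n_points)]
    return sorted({ordered[i] for i in indices})

# Example: 101 distinct values reduced to a 5-point grid,
# and 1,000 rows reduced to 50 curves to plot.
grid = quantile_grid(range(101), 5)
subset = sample_rows([{"id": i} for i in range(1000)], 50)
```

The number of model scoring calls is then (rows sampled) × (grid points) instead of (all rows) × (all distinct values).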

 

Hope this helps. 

 

Ray

 
