Hi Zach,

HPForest is not just multiple decision tree runs; it is a very specific type of decision tree ensemble. Each tree is trained on a sample of the observations rather than all of them, and at each split only a random subset of the variables are candidates (there is a small sketch of this idea at the end of this post). It may not sound intuitive at first, but Breiman and other authors have shown that this randomization produces a more robust model.

Once you choose a model with low interpretability, such as gradient boosting, a random forest, an SVM, or a neural network, you have traded interpretability for better prediction. Here is a useful trick for understanding the variables driving your model when the target is binary:

1. Add a Model Comparison node, a Score node, and a Reporter node after your model.
2. In the Reporter node, set the Nodes option to SUMMARY. Run the flow and open the results.
3. Notice that the PDF report includes the Rapid Predictive Modeler reports for your model, among them the Selected Variable Importance chart, which is based on a decision tree fit to your predicted event. You can use this chart to explain the main drivers of your model (the second sketch below shows the idea).

I find this report easier to use even for a model like HPForest that already outputs variable importance: the chart is easier to explain than the out-of-bag error reduction, and the results usually match.

Before I can make a recommendation for WWSCMD, please share some information and charts:

- the proportion of events to non-events for your target variable (is it a rare event?)
- the iteration plot for your HPForest
- the plots from your Cutoff node results, including ROC, positive rates, and the precision-recall cutoff plot

I hope this helps!

Thanks,
-Miguel
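P.S. In case a concrete illustration helps, here is a minimal Python/scikit-learn sketch (not SAS, and not HPForest's actual implementation) of the sampling scheme described above: each tree trains on a sample of the observations, and only a random subset of the variables are candidates at each split.

    # Toy Python/scikit-learn sketch (not SAS) of the random forest sampling idea:
    # each tree sees a bootstrap sample of the observations, and max_features="sqrt"
    # means only a random subset of the variables are split candidates at each node.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(42)
    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

    trees = []
    for i in range(50):
        # bootstrap sample of the observations for this tree
        rows = rng.choice(len(X), size=len(X), replace=True)
        tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
        tree.fit(X[rows], y[rows])
        trees.append(tree)

    # the ensemble prediction is a vote across the individual trees
    votes = np.mean([t.predict(X) for t in trees], axis=0)
    ensemble_prediction = (votes >= 0.5).astype(int)
    print("ensemble agreement with the target:", (ensemble_prediction == y).mean())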
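And here is the second sketch, a rough Python/scikit-learn approximation of the Selected Variable Importance idea: score the data with the low-interpretability model, then fit a single shallow decision tree to the predicted event and read that surrogate tree's variable importance. The model and variable names below are made up for the example, so treat this as an illustration of the concept rather than a replica of the Rapid Predictive Modeler report.

    # Surrogate-tree variable importance for the PREDICTED event (illustration only).
    import pandas as pd
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=2000, n_features=8, n_informative=4, random_state=0)
    X = pd.DataFrame(X, columns=[f"var_{i}" for i in range(X.shape[1])])

    # stand-in for HPForest: any binary-target model with low interpretability
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    predicted_event = forest.predict(X)

    # surrogate tree trained on the predicted event, not on the original target
    surrogate = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, predicted_event)

    importance = pd.Series(surrogate.feature_importances_, index=X.columns)
    print(importance.sort_values(ascending=False))  # main drivers, easy to explain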