I am using the HP Forest node in SAS Enterprise Miner 14.2 to select the top X (e.g. the top 10) most important input variables among hundreds of input variables for predicting a binary target variable. To do so, in the property window of the HP Forest node, I set the Variable Selection option to "Yes". Then, to specify how many variables to select, I chose "Random Branch Assignments (RBA)" as the Variable Importance Method. Once this method is selected, SAS EM lets you manually enter the Number of Variables to Consider (e.g. 10, to pick the top 10 most important variables).
SAS EM Help says that "RBA Margin Reduction" should be used as the measure of variable importance when the target is a class variable (e.g. a binary target variable). As expected, the Variable Importance table in the HP Forest node output window includes an RBA Margin Reduction column. I have attached an example Variable Importance table in which I asked SAS EM to select the top 22 most important variables in my data. My problem is that I don't know how the numbers in the RBA Margin Reduction column are calculated. What is the equation (or procedure) for calculating RBA Margin Reduction?
The answer may be in the article below, but unfortunately I could not get access to it.
Neville, P. G., and Tan, P.-Y. (2014). “A Forest Measure of Variable Importance Resistant to Correlations.” In Proceedings of the 2014 Joint Statistical Meetings. Alexandria, VA: American Statistical Association.
I would appreciate any help with this question.
A. J.