Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Partial Dependence Plot for boosting decision tree

Occasional Contributor
Posts: 5

Partial Dependence Plot for boosting decision tree

Hi all

How can I produce a partial dependence plot when I'm using a boosted decision tree?

Thanks

SAS Employee
Posts: 122

Re: Partial Dependence Plot for boosting decision tree

Hi,

By boosted decision tree, do you mean the Gradient Boosting node in EM?

Jason Xin
Occasional Contributor
Posts: 5

Re: Partial Dependence Plot for boosting decision tree

Hi Jason

 

Thanks for the reply. No, I'm building my boosted tree with the Start and End Groups nodes in EM.

SAS Employee
Posts: 122

Re: Partial Dependence Plot for boosting decision tree

In that case, I am afraid you will need to add a SAS Code node and write your own program.

Best Regards
Jason Xin
SAS Employee
Posts: 122

Re: Partial Dependence Plot for boosting decision tree

Hi,

In case you have not seen them, there are some SAS programs that implement boosting, which I think is similar to what you are trying to do. Here is a link I found through Google:
http://www.sas-programming.com/2010/03/implement-boosting-algorithm-in-sas.html
Occasional Contributor
Posts: 5

Re: Partial Dependence Plot for boosting decision tree

Hi Jason

Many thanks for that; however, I could not find my answer at that link. I can easily extract a partial dependence plot in R, but in SAS ...

It is frustrating that I have been using SAS EM to develop the models for my PhD thesis and now have to go back to R.

Super Contributor
Posts: 336

Re: Partial Dependence Plot for boosting decision tree

Hi Art,

I looked into the partial dependence plot (2D and 3D versions) for gradient boosting and random forest about a year ago. I was not particularly impressed. It seemed useful when you have 2 or 3 variables, but I wasn't sure where that leads you when you have 4+ variables.

Since all that partial dependence takes into account is the "marginal effect of a variable on the class probability (classification) or response (regression)", I would much rather look at the variable importance coming out of the Gradient Boosting node.
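For readers following the thread, the quoted definition can be made concrete with a small sketch. This is not SAS Enterprise Miner code and not the Gradient Boosting node's internal method; it is a minimal Python illustration that assumes a hypothetical scoring function `toy_predict` standing in for any fitted model, with made-up columns `density` and `speed`. The idea is simply to force one variable to each grid value in every row, score the modified rows, and average.

```python
from statistics import mean

def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite only the feature of interest; keep every other column as observed.
        modified = [{**row, feature: value} for row in rows]
        curve.append(mean(predict(r) for r in modified))
    return curve

# Hypothetical stand-in for a fitted model's scoring function (an assumption).
def toy_predict(row):
    return 2.0 * row["density"] + row["speed"]

# Made-up data, for illustration only.
rows = [
    {"density": 1.0, "speed": 10.0},
    {"density": 2.0, "speed": 20.0},
    {"density": 3.0, "speed": 30.0},
]
grid = [0.0, 1.0, 2.0]
pd_curve = partial_dependence(toy_predict, rows, "density", grid)
print(list(zip(grid, pd_curve)))
```

Because the routine only needs a scoring function, the same loop could in principle be reproduced with score code from any model, though how best to do that inside EM is not something this sketch settles.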

 

If you have more insights about these plots, I will be happy to bring this up in our next development meeting. I am especially interested in whether these plots are something you would use on a real data set with 4 or more variables.

 

Thanks!

-Miguel

Occasional Contributor
Posts: 5

Re: Partial Dependence Plot for boosting decision tree

Hi Miguel

 

Thanks for your reply. But partial dependence plots can also be used when you have more than 3 variables. Partial dependence helps identify interactions between variables in the model and gives a better interpretation. For example, in my study (a traffic crash study), variable importance shows that population density is a significant factor, but how can I find the influence of this variable on the model? I mean, it is not clear whether increasing population density increases or decreases traffic crashes. I know it is possible to work this out from the SAS model code, but that is difficult and time-consuming. For an example, see:

http://onlinepubs.trb.org/onlinepubs/conferences/2011/RSS/1/Chung,Y-S.pdf
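To illustrate the point about interactions, here is a hedged Python sketch of a two-variable partial dependence surface. The model `toy_predict`, the columns `density`, `speed`, and `lanes`, and the data are all invented for illustration; nothing here comes from the crash study or from SAS. If the change in the averaged prediction caused by one variable differs across levels of the other, the model contains an interaction between them, and the sign of the change shows the direction of the effect.

```python
from statistics import mean

def partial_dependence_2d(predict, rows, feat_a, grid_a, feat_b, grid_b):
    """Averaged prediction for every (a, b) pair, other columns left as observed."""
    surface = []
    for a in grid_a:
        surface.append([
            mean(predict({**row, feat_a: a, feat_b: b}) for row in rows)
            for b in grid_b
        ])
    return surface

# Hypothetical model (an assumption): the density effect grows with speed.
def toy_predict(row):
    return row["density"] * row["speed"] + row["lanes"]

# Made-up data, for illustration only.
rows = [
    {"density": 1.0, "speed": 30.0, "lanes": 2.0},
    {"density": 2.0, "speed": 50.0, "lanes": 4.0},
]
surface = partial_dependence_2d(
    toy_predict, rows, "density", [0.0, 1.0], "speed", [10.0, 20.0]
)

# Effect of raising density from 0 to 1, at low speed and at high speed.
effect_low = surface[1][0] - surface[0][0]
effect_high = surface[1][1] - surface[0][1]
print(effect_low, effect_high)
```

Both effects are positive in this toy case, so higher density raises the prediction, and the two effects differ, which is the signature of an interaction.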

Many thanks

 

Alireza

Super Contributor
Posts: 336

Re: Partial Dependence Plot for boosting decision tree

Thanks for the details. I will check out that paper and see whether there is a workaround to calculate these plots when you use Start/End group nodes.


Some input from one of my most tree-versed coworkers:

 

  • These plots are definitely useful, although they might be misleading when the variable on the X-axis is strongly dependent on other variables.
  • This video discusses the problems with them and an ICE method to resolve them. [I haven't watched all of it, but wanted to pass it along ASAP.]

https://www.youtube.com/watch?v=f55onMzbmfY
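As a rough illustration of the ICE idea (taken here to mean individual conditional expectation curves, one per observation rather than a single average), the following Python sketch uses an invented model `toy_predict` and made-up columns `density` and `urban`. It is an assumption-laden toy and not the method shown in the video. It shows how the averaged partial dependence curve can be flat even though every individual curve has a clear slope.

```python
from statistics import mean

def ice_curves(predict, rows, feature, grid):
    """One curve per row: predictions as `feature` sweeps the grid."""
    return [[predict({**row, feature: v}) for v in grid] for row in rows]

# Hypothetical model (an assumption): the density effect flips sign by area type.
def toy_predict(row):
    sign = 1.0 if row["urban"] else -1.0
    return sign * row["density"]

# Made-up data, for illustration only.
rows = [
    {"density": 1.0, "urban": True},
    {"density": 1.0, "urban": False},
]
grid = [0.0, 1.0, 2.0]

curves = ice_curves(toy_predict, rows, "density", grid)

# The partial dependence curve is the pointwise average of the ICE curves.
pd_curve = [mean(column) for column in zip(*curves)]
print(curves)
print(pd_curve)
```

Plotting the individual curves alongside their average is what exposes the opposing effects that the average alone hides.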

 

 

Stay tuned and let's see if I or someone from the community can come up with something.

 

Cheers,

M
