jwoods
Calcite | Level 5

Hi,

 

I have experience with SAS, but I'm having trouble using SAS Enterprise Miner (EM). I have a few questions:

 

1. How is purity measured/ranked?

2. Is the purity of a node based on the training data or the validation data? 

3. Why is the purity of a node based on that data set (whichever one is the answer to question 2)?

4. How can you tell, by looking at a tree, which nodes are the purest?

 

Thanks

1 REPLY
WendyCzika
SAS Employee

You can use Gini impurity as the splitting criterion when growing the decision tree. Its formula is:
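For reference, the usual textbook definition of Gini impurity for a node is G = 1 - sum over k of p_k^2, where p_k is the proportion of the node's observations that fall in target class k. The symbols G and p_k are notation assumed here, not taken from the reply. A node with G = 0 is perfectly pure, and for a binary target the value peaks at 0.5 when the classes are split 50/50. The sketch below is plain Python rather than SAS Enterprise Miner code, and the node names and class labels are hypothetical. It only illustrates how the measure can be computed from the class labels in each node and used to rank nodes from purest to least pure.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_k ** 2) over the class proportions found in `labels`."""
    n = len(labels)
    if n == 0:
        raise ValueError("node has no observations")
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical nodes: each list holds the target values of the
# observations that ended up in that node.
nodes = {
    "node_a": ["yes"] * 10,               # all one class
    "node_b": ["yes"] * 8 + ["no"] * 2,   # mostly one class
    "node_c": ["yes"] * 5 + ["no"] * 5,   # evenly mixed
}

# Lower impurity means a purer node, so sort in ascending order.
ranked = sorted(nodes, key=lambda name: gini_impurity(nodes[name]))

for name in ranked:
    print(f"{name}: Gini impurity = {gini_impurity(nodes[name]):.2f}")
```

The same function can be applied to the training counts or the validation counts of a node, so the two sets of values can be compared side by side.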

 

 

 

