jwoods
Calcite | Level 5

Hi,

 

I have experience in SAS, but I'm having trouble using SAS Enterprise Miner (SAS EM). I have a few questions:

1. How is purity measured/ranked?

2. Is the purity of a node based on the training data or the validation data?

3. Why is the purity of a node indicated by the training or validation data (whichever is the answer to question 2)?

4. How do you know, by looking at a tree, which nodes are the purest?

 

Thanks

1 REPLY
WendyCzika
SAS Employee

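You can use Gini impurity as the splitting criterion when growing the decision tree. For a node, Gini impurity has the formula:

$$G = 1 - \sum_{i=1}^{k} p_i^2$$

where p_i is the proportion of observations in the node that belong to target class i, and k is the number of target classes. A node with G = 0 is perfectly pure.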
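To make the measure concrete, here is a minimal Python sketch (not SAS EM code, and not necessarily how SAS EM computes it internally) that calculates Gini impurity as one minus the sum of squared class proportions, then ranks a few nodes from purest to least pure. The function name `gini_impurity`, the node names, and the class labels are all made up for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_i^2) over the classes in `labels`; 0.0 means a pure node."""
    n = len(labels)
    if n == 0:
        raise ValueError("a node must contain at least one observation")
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical leaf nodes, each holding the target values of its observations.
nodes = {
    "leaf_a": ["yes"] * 10,              # all one class
    "leaf_b": ["yes"] * 8 + ["no"] * 2,  # mostly one class
    "leaf_c": ["yes"] * 5 + ["no"] * 5,  # evenly mixed
}

# Lower impurity means a purer node, so sort ascending.
ranked = sorted(nodes, key=lambda name: gini_impurity(nodes[name]))

for name in ranked:
    print(f"{name}: Gini impurity = {gini_impurity(nodes[name]):.2f}")
```

In this toy example the all-"yes" node has impurity 0 and ranks first, while the evenly mixed two-class node has impurity 0.5, the maximum for a binary target.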

 

 

 
