JKarp_11

Dear community,

I would like to better understand what the property "Perform Cross Validation" in the "Cross Validation" section does for a decision tree in general.

As I understand it, the purpose of cross validation (CV) is not to help select a particular tree as the final model, but rather to assess a model that was trained on 100% of the training sample before the CV. That is, CV provides metrics such as the average MSE (averaged over all the "sub-trees" generated by the CV), which can be useful for estimating the level of precision one can expect in application.
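To make "average MSE" concrete, here is the usual K-fold definition as I understand it. The notation is my own assumption and is not taken from the Enterprise Miner documentation: $D_1, \dots, D_K$ are the folds, and $\hat{f}^{(-k)}$ is the tree trained on all observations except those in fold $D_k$.

```latex
\mathrm{MSE}_k = \frac{1}{|D_k|} \sum_{i \in D_k} \left( y_i - \hat{f}^{(-k)}(x_i) \right)^2,
\qquad
\mathrm{CV}_K = \frac{1}{K} \sum_{k=1}^{K} \mathrm{MSE}_k
```

Under this reading, $\mathrm{CV}_K$ describes the expected error of the modeling procedure and none of the $K$ fold trees is itself the final model.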

I have now run two trees separately, one with "Perform Cross Validation" = Yes and one without. The trees are different: the tree with CV = Yes has fewer leaves. From this outcome I assume that Enterprise Miner uses a specific tree created during the CV as the final model (probably the one with the smallest MSE), i.e. a tree that is trained on (100-X)% instead of 100% of the initial training sample.

Or are the results of the cross validation (the average MSE) used for pruning the original tree? In that case, however, pruning would be executed after the CV. In my case I have set the pruning method property to "Assessment" in the Subtree section.
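To illustrate the two readings I am contrasting, here is a minimal Python sketch. It does not describe Enterprise Miner internals. All names (`fit_tree`, `cv_mse`, `MAX_DEPTH`) and the synthetic data are hypothetical, and a simple depth limit stands in for subtree pruning. Reading A uses CV only to report an error estimate for a fixed tree configuration. Reading B uses the CV error to pick the tree size and then refits on 100% of the data, which would also produce a tree with fewer leaves.

```python
import random
from statistics import mean

def fit_tree(points, depth):
    """Greedy 1-D regression tree: a leaf is a float, a node is (threshold, left, right)."""
    ys = [y for _, y in points]
    xs = sorted({x for x, _ in points})
    if depth == 0 or len(xs) < 2:
        return mean(ys)
    best_sse, best_t = None, None
    for t in xs[1:]:  # every candidate threshold leaves both sides non-empty
        left = [y for x, y in points if x < t]
        right = [y for x, y in points if x >= t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best_sse is None or sse < best_sse:
            best_sse, best_t = sse, t
    left_pts = [p for p in points if p[0] < best_t]
    right_pts = [p for p in points if p[0] >= best_t]
    return (best_t, fit_tree(left_pts, depth - 1), fit_tree(right_pts, depth - 1))

def predict(tree, x):
    """Walk down the tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x < t else right
    return tree

def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])

def cv_mse(points, depth, k=5):
    """Average held-out MSE over k folds for trees of the given maximum depth."""
    fold_mse = []
    for f in range(k):
        train = [p for i, p in enumerate(points) if i % k != f]
        test = [p for i, p in enumerate(points) if i % k == f]
        tree = fit_tree(train, depth)
        fold_mse.append(mean((y - predict(tree, x)) ** 2 for x, y in test))
    return mean(fold_mse)

# Synthetic data (assumption): a step function plus Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(80):
    x = rng.random()
    y = (1.0 if x < 0.5 else 3.0) + rng.gauss(0, 0.5)
    data.append((x, y))

MAX_DEPTH = 4

# Reading A: the final model is the tree trained on 100% of the data;
# CV only reports how well this configuration is expected to perform.
full_tree = fit_tree(data, MAX_DEPTH)
assessment_only = cv_mse(data, MAX_DEPTH)

# Reading B: the CV error chooses the tree size, then the tree is refit on 100% of the data.
cv_by_depth = {d: cv_mse(data, d) for d in range(MAX_DEPTH + 1)}
best_depth = min(cv_by_depth, key=cv_by_depth.get)
pruned_tree = fit_tree(data, best_depth)

print("Reading A: leaves =", count_leaves(full_tree), "CV MSE =", assessment_only)
print("Reading B: leaves =", count_leaves(pruned_tree), "CV MSE =", cv_by_depth[best_depth])
```

In both readings the final tree is trained on the full sample. The difference is only whether the CV error feeds back into the choice of tree size, which is the point I would like to have confirmed for Enterprise Miner.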

Thank you in advance for your assistance! As this is a general question, I hope it can be answered without data or code.

Best regards
