Dear community,

I would like to better understand what the property "Perform Cross Validation" in the "Cross Validation" section of the Decision Tree node does in general.

As I understand it, the purpose of cross validation (CV) is not to help select a particular tree as the final model, but rather to qualify a model that is built on 100% of the training sample before the CV is run. That is, CV provides metrics such as the average MSE (averaged over all the "sub-trees" generated during the CV), which are useful for assessing the level of precision one can expect when the model is applied.

I have now run two trees separately, one with "Perform Cross Validation" = Yes and one without. The trees are different: the tree with CV = Yes has fewer leaves.

Given this outcome, I assume that Enterprise Miner uses a specific tree created during the CV as the final model (probably the one with the smallest MSE), i.e. a tree that is trained on (100 - X)% instead of 100% of the initial training sample. Or are the results of the cross validation (the average MSE) used for pruning the original tree? In that case, however, pruning would be executed after the CV. In my case I have selected the pruning method "Assessment" in the Subtree section.

Thank you in advance for your help! As this is a general question, I hope it can be answered without data or code.

Best regards
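P.S. To make the two interpretations concrete, here is a minimal Python sketch of what I mean. It does not use Enterprise Miner and is not meant to show how the Decision Tree node is actually implemented. The toy data, the one-dimensional regression tree, and all function names (`fit_tree`, `cv_mse`, and so on) are my own assumptions for illustration only.

```python
import random

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(points):
    """Best threshold for one leaf: (SSE reduction, threshold), or None."""
    points = sorted(points)
    best = None
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical x values
        thr = (points[i - 1][0] + points[i][0]) / 2
        left = [y for _, y in points[:i]]
        right = [y for _, y in points[i:]]
        gain = sse(left + right) - sse(left) - sse(right)
        if best is None or gain > best[0]:
            best = (gain, thr)
    return best

def fit_tree(points, n_leaves):
    """Grow a 1-D regression tree best-first until it has n_leaves leaves.
    Returns a list of (lower, upper, prediction) intervals."""
    leaves = [(float("-inf"), float("inf"), points)]
    while len(leaves) < n_leaves:
        candidates = []
        for idx, (_, _, pts) in enumerate(leaves):
            split = best_split(pts)
            if split is not None:
                candidates.append((split[0], idx, split[1]))
        if not candidates:
            break  # no leaf can be split any further
        _, idx, thr = max(candidates)
        lo, hi, pts = leaves.pop(idx)
        leaves.append((lo, thr, [p for p in pts if p[0] < thr]))
        leaves.append((thr, hi, [p for p in pts if p[0] >= thr]))
    return [(lo, hi, sum(y for _, y in pts) / len(pts)) for lo, hi, pts in leaves]

def predict(tree, x):
    for lo, hi, pred in tree:
        if lo <= x < hi:
            return pred

def mse(tree, points):
    return sum((y - predict(tree, x)) ** 2 for x, y in points) / len(points)

def cv_mse(points, n_leaves, k=5):
    """Average held-out MSE over k folds for trees with n_leaves leaves."""
    folds = [points[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(mse(fit_tree(train, n_leaves), folds[i]))
    return sum(errors) / k

# Toy data (an assumption): a step function plus Gaussian noise.
random.seed(0)
data = [(x, (1.0 if x < 5 else 3.0) + random.gauss(0, 0.5))
        for x in (random.uniform(0, 10) for _ in range(60))]

# Interpretation A: the final tree is built on 100% of the data;
# CV only reports how well such a tree is expected to perform.
full_tree = fit_tree(data, n_leaves=8)
cv_estimate = cv_mse(data, n_leaves=8)

# Interpretation B: the CV error chooses the subtree size, and the
# tree built on 100% of the data is cut back to that size. Because
# growth is best-first, the smaller tree is a subtree of full_tree.
best_size = min(range(1, 9), key=lambda n: cv_mse(data, n))
pruned_tree = fit_tree(data, n_leaves=best_size)

print(len(full_tree), round(cv_estimate, 3), best_size, len(pruned_tree))
```

Under interpretation B the final tree can end up with fewer leaves than under A even though both are trained on the full sample, which would match what I observe. Whether that is really what happens in Enterprise Miner is exactly my question.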