Hi,

I would be grateful if an expert on the forum could help me understand how to decide the optimum number of leaves in a decision tree analysis.

I am using SAS. If I supply leaves=6 in my model, the misclassification rates for the validation and training data sets are 18.6% and 18.8% respectively, and SAS lists 5 variables as significant.

If I don't supply a leaf count in the code and let SAS decide, then after pruning SAS settles on 10 leaves. The misclassification rates for the validation and training data sets are then 17.5% and 16.9% respectively, and SAS lists 6 variables as significant.

The misclassification rates have gone down and the number of leaves after pruning has increased to 10. Is that a good thing, or does it indicate overfitting?

Looking forward to the opinions of the experts in this group.

Thanks & Regards,
Vikrant
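P.S. To make the comparison concrete, here is a minimal Python sketch (not SAS) that tabulates the two trees above and the gap between validation and training error. The rates are the ones quoted in this post. The names `rates`, `summarize` and `best_by_validation` are made up for illustration, and choosing the tree with the lowest validation error is only a common rule of thumb that I am assuming here, not necessarily what SAS does internally.

```python
# Misclassification rates (percent) copied from the post above.
# Keys are leaf counts; "train" and "valid" are the two data partitions.
rates = {
    6: {"train": 18.8, "valid": 18.6},
    10: {"train": 16.9, "valid": 17.5},
}


def summarize(rates):
    """Return (leaves, train, valid, gap) rows, where gap = valid - train.

    A large positive gap (validation much worse than training) is the
    usual warning sign of overfitting.
    """
    rows = []
    for leaves in sorted(rates):
        train = rates[leaves]["train"]
        valid = rates[leaves]["valid"]
        rows.append((leaves, train, valid, round(valid - train, 1)))
    return rows


def best_by_validation(rates):
    """Assumed heuristic: pick the leaf count with the lowest validation error."""
    return min(rates, key=lambda leaves: rates[leaves]["valid"])


rows = summarize(rates)
for leaves, train, valid, gap in rows:
    print(f"leaves={leaves:2d}  train={train:.1f}%  valid={valid:.1f}%  gap={gap:+.1f}")

best = best_by_validation(rates)
print("lowest validation error at leaves =", best)
```

With these numbers the gap moves from -0.2 to +0.6 percentage points, while the validation error itself drops from 18.6% to 17.5%. Whether a gap of that size matters is exactly what I am unsure about.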