In SAS decision trees, '10 repeats' means 10-fold cross-validation run 10 times: 10 x 10 = 100 cross-validation trees plus the original tree, for a total of 101 trees.
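To make the count concrete, here is a minimal Python sketch of repeated k-fold splitting. It is a generic illustration, not how SAS partitions the data internally; the function name `repeated_kfold`, the seed, and the 200-observation data set are all assumptions made for the example.

```python
import random

def repeated_kfold(n_obs, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_obs))
        rng.shuffle(idx)                       # fresh random partition per repeat
        folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
        for f, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            yield r, f, train_idx, test_idx

# Hypothetical data set of 200 observations: one tree is fit per split.
cv_fits = sum(1 for _ in repeated_kfold(n_obs=200, k=10, repeats=10))
total_trees = cv_fits + 1  # plus the original tree fit on all the data
print(cv_fits, total_trees)  # 100 101
```

Each repeat reshuffles the observations, so the 10 repeats give 10 different 10-fold partitions rather than the same one refit 10 times.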
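'Leave-one-out' cross-validation has been available in the EM Regression Node. In leave-one-out CV, the number of folds n equals the total number of observations in your data set: each model is fit on n - 1 observations and scored on the one left out.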
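As a rough sketch of what leave-one-out splitting looks like (plain Python, not the EM Regression Node's own implementation; `leave_one_out` and the 6-observation data set are hypothetical names and values chosen for illustration):

```python
def leave_one_out(n_obs):
    """Yield (train_idx, test_idx) pairs, one fold per observation."""
    for i in range(n_obs):
        train_idx = [j for j in range(n_obs) if j != i]
        yield train_idx, [i]

n = 6  # hypothetical data set size
splits = list(leave_one_out(n))
print(len(splits))  # 6
print(splits[0])    # ([1, 2, 3, 4, 5], [0])
```

With n observations this means n model fits, which is why leave-one-out tends to be practical mainly for small data sets or cheap-to-fit models.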
Re: Using CV, do you still partition your data into training and validation subsets?
Not for a single EM modeling node. However, partitioning into data-available-for-CV vs test-hold-out is still useful, and when comparing models from several EM modeling nodes, scoring them all on a single shared validation data set may be useful. It's up to the analyst.
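The data-available-for-CV vs test-hold-out split can be sketched as follows. This is an illustration only, outside of Enterprise Miner; the function name `split_cv_pool_and_holdout`, the 20% hold-out fraction, and the 100-observation data set are assumptions, not recommendations from SAS.

```python
import random

def split_cv_pool_and_holdout(n_obs, holdout_frac=0.2, seed=0):
    """Randomly set aside a test hold-out; the rest is available for CV."""
    rng = random.Random(seed)
    idx = list(range(n_obs))
    rng.shuffle(idx)
    n_holdout = int(round(n_obs * holdout_frac))
    holdout = sorted(idx[:n_holdout])   # never seen during model building
    cv_pool = sorted(idx[n_holdout:])   # cross-validate within this pool
    return cv_pool, holdout

cv_pool, holdout = split_cv_pool_and_holdout(n_obs=100, holdout_frac=0.2)
print(len(cv_pool), len(holdout))  # 80 20
```

All cross-validation then happens inside `cv_pool`, and `holdout` is scored once at the end, for example to compare the final models from several modeling nodes on the same data.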
Re: primarily used when small data sets are not large enough for partitioning
That is my belief. Partitioning applies hold-out data directly to the model being deployed, providing a transparently unbiased estimate of accuracy. CV validates the model construction process. People disagree as to whether leave-one-out cross-validation provides unbiased or overly optimistic estimates of prediction accuracy.
However, many people prefer to cross-validate everything, regardless of data set size.