Very thorough reply by Neville, who implemented all of these methods in SAS. A few supporting comments:
1. Outliers - The base model used in RF is a large decision tree. Decision trees are robust to outliers because they isolate them in small regions of the feature space. The prediction for each leaf is the average (for regression) or the majority class (for classification) of the observations in that leaf, so outliers isolated in their own leaves won't influence the rest of the predictions. In regression, for instance, they do not affect the means of the other leaves.
2. Validation Data - Yes, please use it in the common case of rare events, where the out-of-bag (OOB) sample might not be sufficient.
3. Transforms - Consider transforming a continuous Y, as with many other algorithms.
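To illustrate point 1, here is a toy sketch in Python (not SAS, and a single small regression tree rather than a full random forest). The data, the function names `fit_tree` and `predict`, and the depth limit are all made up for illustration. It fits the same tree with and without one extreme observation and compares the predictions for ordinary inputs.

```python
from statistics import mean


def sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_tree(xs, ys, depth):
    """Greedy 1-D regression tree: each split minimises the total SSE."""
    best = None
    if depth > 0:
        points = sorted(set(xs))
        # Candidate thresholds are midpoints between neighbouring x values.
        for lo, hi in zip(points, points[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in zip(xs, ys) if x <= t]
            right = [y for x, y in zip(xs, ys) if x > t]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, t)
    # Stop when no split is available or no split reduces the error.
    if best is None or best[0] >= sse(ys):
        return {"leaf": mean(ys)}
    t = best[1]
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    return {
        "threshold": t,
        "left": fit_tree([x for x, _ in left], [y for _, y in left], depth - 1),
        "right": fit_tree([x for x, _ in right], [y for _, y in right], depth - 1),
    }


def predict(tree, x):
    """Walk down the tree and return the mean of the leaf that x falls in."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["leaf"]


# Hypothetical data: a step function, plus one extreme observation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]

tree_without = fit_tree(xs, ys, depth=2)
tree_with = fit_tree(xs + [1000.0], ys + [500.0], depth=2)

for x in (2.0, 7.0):
    print(x, predict(tree_without, x), predict(tree_with, x))
```

In this toy case the first split of the second tree cuts the extreme point off into its own leaf, so the leaf means for the ordinary observations come out the same in both fits. A linear regression on the same data would be pulled strongly toward that point.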