Hi,

I am confused about what exactly SAS Enterprise Miner does under different validation settings.

1st case: Suppose you have only a data set node connected to a Regression node to perform logistic regression. You select Validation Error as the Selection Criterion for Stepwise under Model Selection, then run this simple model. What exactly happens with validation here? What percentages are used for the training and validation sets? What kind of validation is used (k-fold, etc.)? Which model is selected?

2nd case: This time, suppose you have a Data Partition node between the data set and Regression nodes, and you set the partition to 70% train and 30% validate (no test). When you run the model, which partition is used for validation: the one from the Data Partition node, or whatever validation technique is configured in the Regression node's properties window?

3rd case: Suppose you now add a second path from the Data Partition node to a Neural Network node, and connect both the Regression and Neural Network nodes to a Model Comparison node. Which partition is used for the comparison statistics? What method is used?

I think my main problem is the difference between the validation procedure for a single model type and the procedure used when you have different models. My guess is that for a single model, where there is no comparison with other models, validation just plays the same role as testing. Is that true?

I greatly appreciate the help. Please let me know if my questions are not clear.

Thanks,
Prof. Boylu