Re: Predictive Modeling Using Logistic Regression
On page 3-59 of the course text, variable transformation is suggested as a way of accounting for a nonlinear relationship between an input and the output. However, in the order the topics (and the related SAS logic steps) are presented in the course, imputation of missing values is done in an earlier step, as part of the data preparation stage. On the other hand, the course "Applied Analytics Using SAS Enterprise Miner" emphasizes throughout that data imputation should be done after transforming variables (see page 4-53 of that course text). Which order is more appropriate, or is either approach valid?
My response:
The following are best practices for fitting regression models:
- Of the three pre-processing steps that precede regression modeling (recoding categorical levels, transforming interval inputs, and imputing missing values), missing value imputation is the most significant. That is why it is introduced first in Chapter 4 of the AAEM course.
- We also recommend that missing value imputation be the last step before fitting the regression model, that is, after any recoding and transformation.
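
The recommended ordering can be illustrated with a minimal sketch. It is written in plain Python rather than SAS, and the input values, the log transformation, mean imputation, and the function names are all assumptions chosen for illustration; they are not taken from either course text. The sketch only shows that the order of the two steps changes the value that ends up replacing a missing entry.

```python
import math

def log_transform(values):
    """Apply log(1 + x) to each non-missing value; None stays missing."""
    return [None if v is None else math.log1p(v) for v in values]

def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

# Hypothetical skewed interval input with one missing value.
income = [1.0, 3.0, None, 99.0]

# Recommended order: transform first, impute as the last step.
transform_then_impute = impute_mean(log_transform(income))

# Reverse order, shown only for comparison.
impute_then_transform = log_transform(impute_mean(income))

print("transform, then impute:", transform_then_impute[2])
print("impute, then transform:", impute_then_transform[2])
```

With a skewed input like this one, imputing on the raw scale lets the large value pull the fill-in upward, so the two orders give different imputed values on the transformed scale.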