pvareschi
Quartz | Level 8

Re: Predictive Modeling Using Logistic Regression

On page 3-59 of the course text, variable transformation is suggested as a way of accounting for a nonlinear relationship between an input and the output. However, in the order the topics (and the related SAS logic steps) are presented in the course, imputation of missing values is done in an earlier step, as part of the data preparation stage. On the other hand, the course "Applied Analytics Using SAS Enterprise Miner" emphasizes throughout that imputation should be done after transforming variables (see page 4-53 of that course text). Which order is more appropriate, or are both approaches valid?

1 ACCEPTED SOLUTION

gcjfernandez
SAS Employee

Re: Predictive Modeling Using Logistic Regression

My response:

The following best practices apply when fitting regression models:

  • Of the three pre-processing steps performed before regression modeling (recoding categorical levels, transforming interval inputs, and imputing missing values), missing value imputation is the most significant. That is why it is introduced first in the AAEM training, in Chapter 4.
  • We also recommend that missing value imputation be the last step before fitting the regression model.
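
As a rough illustration of the ordering recommended above (transform the interval input first, impute last), here is a minimal Python sketch. It is not SAS or Enterprise Miner code. The data values, the function names, and the choice of a log transform with median imputation are all assumptions made only for this example.

```python
import math
from statistics import median

def log_transform(values):
    """Apply log(1 + x) to each present value; missing values (None) stay missing."""
    return [None if v is None else math.log1p(v) for v in values]

def impute_median(values):
    """Replace each missing value (None) with the median of the present values."""
    present = [v for v in values if v is not None]
    fill = median(present)
    return [fill if v is None else v for v in values]

# Hypothetical skewed interval input with two missing values.
income = [20.0, 35.0, None, 50.0, 400.0, None]

# Step 1: transform while the missing values are still missing.
transformed = log_transform(income)

# Step 2: impute as the last step before the model is fit, so the
# fill value is computed on the transformed scale.
model_input = impute_median(transformed)
```

With this ordering the fill value is the median of the transformed values. Imputing first would instead run the imputed raw values through the transform along with the observed ones.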
