11-08-2016 10:21 AM
Hello everyone, I am a student using SAS Enterprise Miner for the first time this semester. I am working on a team project and want to add a new variable to the original data source. I plan to use the Transform Variables node to enter SAS code that adds the variable
Net_Gain = Capital_Gain - Capital_Lost.
Should the Data Partition node come before or after the Transform Variables node?
1. If Data Partition --> Transform Variables: will the new variable be created only for the training set?
2. If Transform Variables --> Data Partition: does that mean the new variable is created for the whole data set? Will this affect scoring accuracy, since the score data doesn't have the Net_Gain variable?
Thank you very much,
11-08-2016 02:32 PM
It doesn't actually matter. If the variable ends up being used in the final model, the final scoring code will account for it.
This is one of the nice features of SAS EM - it can replicate the process from the start including variable transformations.
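To illustrate why the order is not a concern here: Net_Gain is computed row by row from two input columns, so the same assignment can simply be repeated on any new data at scoring time. Below is a minimal sketch of what that step amounts to, not the code SAS EM actually generates. The table names work.score_data and work.score_with_gain are hypothetical, and Capital_Lost is assumed to be the real name of the loss variable from the question.

```sas
/* Sketch only: work.score_data and work.score_with_gain are hypothetical names. */
/* Capital_Lost is assumed to be the loss variable mentioned in the question.    */
data work.score_with_gain;
    set work.score_data;
    /* Row-wise formula: each row depends only on its own inputs, so the  */
    /* value is the same whether it is computed before or after the       */
    /* Data Partition node.                                               */
    Net_Gain = Capital_Gain - Capital_Lost;
run;
```

The score data only needs the two raw inputs, Capital_Gain and Capital_Lost. Net_Gain itself does not have to exist there beforehand.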