[Miner] When should I do data partition?

Hello everyone, I am a student using SAS Enterprise Miner for the first time this semester. I am working on a team project and want to create a new variable from the original data source. I plan to use the Transform Variables node to write SAS code that adds the variable

Net_Gain = Capital_Gain - Capital_Lost.


Should I place the Data Partition node before or after the Transform Variables node?


1. If Data Partition --> Transform Variables: will the new variable be created only for the training set?


2. If Transform Variables --> Data Partition: does this mean the new variable is created for the whole data set? Will this affect scoring accuracy, given that the score data does not have the Net_Gain variable?


Thank you very much.

Re: [Miner] When should I do data partition?

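It doesn't actually matter. Net_Gain is computed row by row from two columns that already exist, so you get the same values whether you transform before or after partitioning. If the variable ends up being used in the final model, the final score code will account for it by computing it from its input variables, so the score data does not need to contain Net_Gain beforehand.

This is one of the nice features of SAS Enterprise Miner: the score code can replicate the process from the start, including variable transformations.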
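
To make the reasoning concrete, here is a minimal sketch in plain Python rather than SAS Enterprise Miner. It only illustrates why a row-wise transformation gives the same result in either order, and how scoring data without Net_Gain can still be scored once the same transformation is replayed. The helper names `add_net_gain` and `partition`, the 70/30 split, and the sample records are all made up for illustration; they are not Enterprise Miner's API or the poster's data.

```python
import random


def add_net_gain(rows):
    """Row-wise transform: each Net_Gain depends only on its own row."""
    return [
        {**row, "Net_Gain": row["Capital_Gain"] - row["Capital_Lost"]}
        for row in rows
    ]


def partition(rows, train_fraction=0.7, seed=42):
    """Split rows into (train, validation) using a seeded shuffle of indices."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(len(rows) * train_fraction)
    train = [rows[i] for i in indices[:cut]]
    valid = [rows[i] for i in indices[cut:]]
    return train, valid


# Hypothetical records, invented purely for this illustration.
data = [
    {"id": i, "Capital_Gain": 100 * i, "Capital_Lost": 30 * i}
    for i in range(10)
]

# Order A: partition first, then transform each partition.
train_a, valid_a = partition(data)
train_a, valid_a = add_net_gain(train_a), add_net_gain(valid_a)

# Order B: transform the whole table first, then partition.
train_b, valid_b = partition(add_net_gain(data))

# Both orders yield identical partitions.
same_result = (train_a == train_b) and (valid_a == valid_b)

# Scoring data arrives without Net_Gain; replaying the transform adds it.
score_rows = [{"id": 99, "Capital_Gain": 500, "Capital_Lost": 120}]
scored = add_net_gain(score_rows)
```

The equivalence holds because the transformation uses no information from other rows. A transformation that estimates something from the data (for example, standardizing with a mean) would be a different case, and this sketch does not cover it.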


