ADChau
Calcite | Level 5

Hello everyone, I am a student using SAS Enterprise Miner for the first time this semester. I am working on a team project and want to add a new variable to the original data source. I plan to use the Transform Variables node to enter SAS code that creates the variable:

Net_Gain = Capital_Gain - Capital_Lost;

 

Should I put the Data Partition node before or after the Transform Variables node?

 

1. If Data Partition --> Transform Variables: will the new variable be created only for the training set?

 

2. If Transform Variables --> Data Partition: does that mean the new variable is created for the whole data set? Will this affect scoring accuracy, since the score data doesn't have the Net_Gain variable?

 

Thank you very much, 


Reeza
Super User

It doesn't actually matter. If the variable ends up being used in the final model, the final score code will include the transformation that creates it.

This is one of the nice features of SAS Enterprise Miner (EM): it can replicate the process from the start, including variable transformations.
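
As a rough illustration of why the node order makes no difference for a formula like this one: Net_Gain is computed row by row from two existing columns, so each partition or new data set gets the same value for the same row. The sketch below is plain DATA step code, not the code Enterprise Miner generates. The data set names work.train and work.score are assumptions made for the example, and Capital_Lost is the loss variable from the question, written with an underscore because SAS variable names cannot contain spaces.

```sas
/* Hypothetical data set names: work.train and work.score are   */
/* placeholders, not names produced by Enterprise Miner.        */

/* The row-wise formula applied to the training partition.      */
data work.train_t;
    set work.train;
    Net_Gain = Capital_Gain - Capital_Lost;
run;

/* The same formula applied to new data at scoring time, so the */
/* score data only needs the two input columns, not Net_Gain.   */
data work.score_t;
    set work.score;
    Net_Gain = Capital_Gain - Capital_Lost;
run;
```

If either input is missing on a row, the subtraction yields a missing Net_Gain for that row, in both data sets alike.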

 

 
