ADChau
Calcite | Level 5

Hello everyone, I am a student using SAS Enterprise Miner for the first time this semester. I am working on a team project and want to add a new variable to the original data source. I plan to use the Transform Variables node to enter SAS code that adds the variable

Net_Gain = Capital_Gain - Capital_Loss.

 

Should I place the Data Partition node before or after the Transform Variables node?

 

1. If Data Partition --> Transform Variables: will the new variable be created only for the training set?

 

2. If Transform Variables --> Data Partition: does that mean the new variable is created for the whole data set? Will this affect scoring accuracy, given that the score data doesn't have the Net_Gain variable?

 

Thank you very much, 


1 REPLY
Reeza
Super User

The order doesn't actually matter. If the variable ends up being used in the final model, the final scoring code will account for it.

This is one of the nice features of SAS EM: it can replicate the whole process from the start, including variable transformations.
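
To illustrate the point about scoring, here is a minimal sketch of what recomputing the variable at scoring time amounts to. It is not the code SAS EM generates. The data set names work.new_data and work.scored, the sample values, and the variable name Capital_Loss are assumptions made for the example. The idea is only that the score data needs the raw inputs, and Net_Gain is rebuilt from them.

```sas
/* Hypothetical score data: it has the raw inputs but no Net_Gain.
   Data set names, variable names, and values are assumptions. */
data work.new_data;
    input Capital_Gain Capital_Loss;
    datalines;
5000 0
0 1500
2000 500
;
run;

/* The same assignment entered in the Transform Variables node is
   repeated here, so Net_Gain is derived before the model is applied. */
data work.scored;
    set work.new_data;
    Net_Gain = Capital_Gain - Capital_Loss;
run;

/* Inspect the result. */
proc print data=work.scored;
run;
```

Note that if either input is missing for a row, the subtraction returns a missing Net_Gain for that row, so it may be worth checking the raw inputs before scoring.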

 

 
