Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

SAS Enterprise Miner - How do I include 2 Data sets into a process flow for Decision Tree modelling?

Occasional Contributor
Posts: 6

Hi,

 

I have 2 separate datasets: 1 for training and 1 for testing.  The variables are the same, except that the training dataset also contains a Target variable.  The Target variable is ordinal and takes the values Low, Medium and High.  Each record has a unique identifier, 'TranID', plus the variables used for modelling.  I want to build a simple decision tree to predict the probabilities of High, Medium and Low for each record in the test dataset.

 

My questions are,

  1. How do I add the test dataset to the process flow?  The SAS EM guides all show a single raw input dataset that is then split by the Data Partition node into Train, Validate and Test datasets.
  2. Will the decision tree output the test results as "TranID, Low (probability), Medium (probability), High (probability)", which I can export to a text file?  If yes, which node(s) do I have to use?

I am a SAS EM newbie and am using SAS EM version 14.1.

 

Thank you very much in advance.

 

Regards,

Lobbie


Accepted Solutions
Solution
‎04-08-2017 09:29 PM
SAS Super FREQ
Posts: 306

Re: SAS Enterprise Miner - How do I include 2 Data sets into a process flow for Decision Tree modell

Your test data set is what Enterprise Miner considers a "score" data set.  To get predictions for it, connect two nodes to a Score node: the Input Data node for that data set (with its Role property set to Score) and the modeling node that you want to use for your predictions (the one trained on your training data).  The attached screenshot shows a sample flow.  The Score node applies the score code from the model to your score data set, which doesn't contain the target.  Hope that helps!

 


ScoreFlow.png
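
For readers who want to see the train/score distinction outside the Enterprise Miner GUI, here is a minimal Python sketch. It is not what the Score node runs internally. It assumes a hypothetical one-split tree on a made-up numeric input called `amount`, and every data value and the `P_*` column names are invented for illustration. The model is fit only on rows that carry the target, then applied to rows that have just `TranID` and the inputs, giving one probability per target level.

```python
from collections import Counter

LEVELS = ("Low", "Medium", "High")  # target levels from the question


def leaf_probs(rows):
    """Return the share of each target level among the rows in one leaf."""
    counts = Counter(r["Target"] for r in rows)
    n = len(rows)
    if n == 0:
        # Empty leaf: fall back to equal probabilities (an arbitrary choice).
        return {lvl: 1 / len(LEVELS) for lvl in LEVELS}
    return {lvl: counts[lvl] / n for lvl in LEVELS}


def fit_stump(train, feature, threshold):
    """Fit a one-split tree on rows that DO contain the target."""
    left = [r for r in train if r[feature] <= threshold]
    right = [r for r in train if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": leaf_probs(left),
        "right": leaf_probs(right),
    }


def score(model, rows):
    """Apply the fitted tree to rows that do NOT contain the target."""
    result = []
    for r in rows:
        side = "left" if r[model["feature"]] <= model["threshold"] else "right"
        probs = model[side]
        out = {"TranID": r["TranID"]}
        out.update({f"P_{lvl}": probs[lvl] for lvl in LEVELS})
        result.append(out)
    return result


# Invented training data (role: Train). "amount" is a hypothetical input.
train = [
    {"TranID": 1, "amount": 10, "Target": "Low"},
    {"TranID": 2, "amount": 20, "Target": "Low"},
    {"TranID": 3, "amount": 30, "Target": "Medium"},
    {"TranID": 4, "amount": 60, "Target": "High"},
    {"TranID": 5, "amount": 70, "Target": "High"},
    {"TranID": 6, "amount": 80, "Target": "Medium"},
]

# Invented score data (role: Score): same inputs, no Target column.
to_score = [
    {"TranID": 101, "amount": 15},
    {"TranID": 102, "amount": 75},
]

model = fit_stump(train, feature="amount", threshold=40)
scored = score(model, to_score)
for row in scored:
    print(row)
```

The point of the sketch is only the separation of roles: the target is needed to fit the model, but not to apply it.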



All Replies
SAS Employee
Posts: 66

Re: SAS Enterprise Miner - How do I include 2 Data sets into a process flow for Decision Tree modell

Posted in reply to WendyCzika

You can assign each data set whatever role you want it to have. I created the image below to show that one file is set to Train and the second file is set to Test. I can then feed them both into the next node in the flow.

 

EM_NEWPROJECT2.gif

SAS Employee
Posts: 66

Re: SAS Enterprise Miner - How do I include 2 Data sets into a process flow for Decision Tree modell

Posted in reply to MelodieRush

To save the data to a .txt file, you can use the Save Data node on the Utility tab.

 

2017-03-16_13-00-48.png
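
As a rough picture of the text file asked for in the original question (TranID plus one probability per level), here is a small Python sketch. Inside Enterprise Miner the Save Data node does the export; this only illustrates the target layout. The `P_Low`, `P_Medium` and `P_High` column names and all values are hypothetical, and the real names in your scored data may differ.

```python
import csv
import io

LEVELS = ("Low", "Medium", "High")


def write_scores(rows, out):
    """Write TranID plus one probability column per level, tab-delimited."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["TranID", *LEVELS])
    for r in rows:
        probs = [f"{r['P_' + lvl]:.4f}" for lvl in LEVELS]
        writer.writerow([r["TranID"], *probs])


# Invented scored rows; the P_* column names are hypothetical.
rows = [
    {"TranID": 101, "P_Low": 0.70, "P_Medium": 0.20, "P_High": 0.10},
    {"TranID": 102, "P_Low": 0.05, "P_Medium": 0.25, "P_High": 0.70},
]

# StringIO stands in for a real file such as open("scores.txt", "w", newline="").
buffer = io.StringIO()
write_scores(rows, buffer)
print(buffer.getvalue())
```

Passing a file handle instead of a path keeps the function easy to try without touching the disk.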

Occasional Contributor
Posts: 6

Re: SAS Enterprise Miner - How do I include 2 Data sets into a process flow for Decision Tree modell

Posted in reply to MelodieRush

Hi Wendy and Melodie,

 

Thank you both for your answers.  

 

@MelodieRush, if I need to score the Test dataset later, do I just connect the Score node to the Decision Tree node as shown by @WendyCzika and SAS EM will know which dataset is which?

 

Regards,

Will

SAS Employee
Posts: 66

Re: SAS Enterprise Miner - How do I include 2 Data sets into a process flow for Decision Tree modell

Yes, you can never go wrong following @WendyCzika's advice! EM has several data set roles: Train, Validate, Test, Score, Transaction and Raw. It knows what to do when your data is correctly identified and defined.

☑ This topic is solved.
