I have already prepared the training and validation data sets using time considerations, which need a specific approach. I created a separate variable ('TrainingOrValidation') that records whether each observation belongs to the 'Training' or the 'Validation' set. Is there any way in SAS Enterprise Miner to assign observations to a partition based on the values in that column? I don't want Enterprise Miner to do the split itself, as shown here; I am looking for a way to tell it which observations belong to Training and which of the remaining ones belong to Validation.
I would really appreciate any help.
Thanks
I think you would need to do something like this in a SAS Code node in place of the Data Partition node:
data &em_export_train &em_export_validate;
    set &em_import_data;
    /* Route each row by its partition flag; the comparison is case-sensitive */
    if strip(TrainingOrValidation)='Training' then output &em_export_train;
    else if strip(TrainingOrValidation)='Validation' then output &em_export_validate;
run;
Thanks for your quick help. However, could you kindly help me understand the following:
1. How are these variables named, and how do they help in splitting the data set? Does SAS automatically understand that the data sets named em_export_train and em_export_validate hold the Training and Validation observations respectively? Or can these names be anything?
2. Do I still need a Data Partition node after the SAS Code node, or can I connect the SAS Code node directly to Variable Clustering/Interactive Grouping/Scorecard?
Thanks again
Kind regards,
Mari
1. Yes, those macro variables will resolve to the correct data set names. The only things you would potentially change are the name of the variable that holds the partition indicator, which I have as TrainingOrValidation, and its values, which I have as 'Training' and 'Validation'.
2. You do not need a Data Partition node after the SAS Code node. The SAS Code node takes the place of the Data Partition node, and you can connect it to whatever nodes come next.
Hope that helps!
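If you want to see what those macro variables resolve to in your own flow, a minimal sketch like the one below could be run inside the same SAS Code node. It assumes the node has a training data set coming in; the library and data set names written to the log will depend on your diagram. The PROC FREQ step is an extra check I am suggesting, not something required: any row whose TrainingOrValidation value is neither 'Training' nor 'Validation' (including missing or differently cased values) would be written to neither exported data set, so it is worth confirming that only those two values occur.

```sas
/* Write the resolved data set names to the log */
%put NOTE: Import data     = &em_import_data;
%put NOTE: Export train    = &em_export_train;
%put NOTE: Export validate = &em_export_validate;

/* List every value of the partition flag, including missing ones */
proc freq data=&em_import_data;
    tables TrainingOrValidation / missing;
run;
```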
1. Thanks so much. I can see these names under the 'Exported Data' field in the Properties tab. Also, as I see here below, I can easily define the code for everything I want, including the test/score data sets. Is my understanding right?
2. I understand. That's a lot of help from you, Wendy.
Kind regards,
Mari
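For the test data set, the same pattern could be extended along the following lines. This is only a sketch under two assumptions: that the partition column also carries a third value, written here as 'Test' (a hypothetical value, adjust it to whatever your data uses), and that the SAS Code node exposes an &em_export_test macro variable alongside the two used above. It is worth confirming the latter against the 'Exported Data' list of your node before relying on it.

```sas
data &em_export_train &em_export_validate &em_export_test;
    set &em_import_data;
    select (strip(TrainingOrValidation));
        when ('Training')   output &em_export_train;
        when ('Validation') output &em_export_validate;
        when ('Test')       output &em_export_test;   /* 'Test' is an assumed value */
        otherwise;                                    /* any other value is dropped */
    end;
run;
```

The SELECT block behaves like the IF/ELSE IF chain above; the empty OTHERWISE statement just makes it explicit that unmatched rows are dropped.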
