Solved: Re: Assign Observation to Train and Validation using the Values for a ...

ggfggrr · Posted 02-20-2019 07:07 AM

I have already prepared the training and validation data sets using time-based considerations, which require a specific approach. I created a separate variable ('TrainingOrValidation') that records whether each observation belongs to the 'Training' or the 'Validation' set. Is there any way in SAS Enterprise Miner to assign observations based on the values in this column? I don't want Enterprise Miner to do the split itself, as shown here, and I am looking for a way to tell it which observations belong to Training and which of the remaining ones belong to Validation.
I would really appreciate any help.

Thanks

WendyCzika · Posted 02-20-2019 04:45 PM

I think you would need to do something like this in a SAS Code node in place of the Data Partition node:

data &em_export_train &em_export_validate;
    set &em_import_data;
    if strip(TrainingOrValidation)='Training' then output &em_export_train;
    else if strip(TrainingOrValidation)='Validation' then output &em_export_validate;
run;

ggfggrr · Posted 02-20-2019 05:45 PM

Thanks for your quick help. However, could you kindly help me understand the following:

1. I would like to understand how these names are chosen and how they help in splitting the data set. Does SAS automatically understand that the data sets named em_export_train and em_export_validate hold the Training and Validation observations, respectively?

Can these names be anything?

2. Do I still need a Data Partition node after the SAS Code node, or can I connect the SAS Code node directly to Variable Clustering/Interactive Grouping/Scorecard?

Thanks again

Kind regards,

Mari

WendyCzika · Posted 02-21-2019 09:07 AM

1. Yes, those macro variables will resolve to the correct names of the data sets. The only things you might need to change are the name of the variable that holds the partition indicator, which I have as TrainingOrValidation, and its values, which I have as 'Training' and 'Validation'.

2. You do not need a Data Partition node after the SAS Code node. The SAS Code node takes the place of the Data Partition node, and you can connect it to whatever nodes come next.
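As a rough illustration of point 1, here is a variation on the same step for the case where the indicator values might differ in letter case or contain something unexpected. This is only a sketch: the helper variable _part and the warning text are made up for the example, and the variable name and values should be swapped for whatever the data actually uses.

```sas
/* Sketch for a SAS Code node used in place of the Data Partition node. */
/* _part is a hypothetical helper variable; it is dropped from the output. */
data &em_export_train &em_export_validate;
    set &em_import_data;
    length _part $32;
    /* Normalize the indicator so 'training', 'Training ', etc. all match */
    _part = upcase(strip(TrainingOrValidation));
    if _part = 'TRAINING' then output &em_export_train;
    else if _part = 'VALIDATION' then output &em_export_validate;
    /* Anything else is written to neither data set; flag it in the log */
    else putlog 'WARNING: Unexpected partition value at row ' _n_ TrainingOrValidation=;
    drop _part;
run;
```

Rows with a missing or misspelled indicator would otherwise be dropped silently, so the log message is there to make that visible.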

Hope that helps!

ggfggrr · Posted 02-21-2019 09:12 AM

1. Thanks so much. I can see these names under the 'Exported Data' field in the Properties tab. Also, as I see here below, I can easily define all the code I want, including for the test/score data sets. Is my understanding right?

2. I understand. That's a lot of help from you, Wendy.
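For the test data set mentioned in point 1, the same pattern could presumably be extended along these lines. This is a sketch under assumptions: it supposes the indicator column also carries a hypothetical 'Test' value, which is not part of the original data described in this thread, and it relies on the SAS Code node's &em_export_test macro variable.

```sas
/* Sketch: three-way split in a SAS Code node.                        */
/* The 'Test' value is an assumption made for this example only.      */
data &em_export_train &em_export_validate &em_export_test;
    set &em_import_data;
    select (strip(TrainingOrValidation));
        when ('Training')   output &em_export_train;
        when ('Validation') output &em_export_validate;
        when ('Test')       output &em_export_test;
        otherwise;  /* rows with any other value are written nowhere */
    end;
run;
```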

Kind regards,

Mari
