<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created) in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/577911#M7928</link>
    <description>&lt;P&gt;Hi PrestickNinja,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I created a simple diagram to see if I could replicate the issue you were experiencing.&amp;nbsp; In short, I could not replicate it, but the attempt gave me some ideas about what might be going wrong for you.&lt;/P&gt;&lt;P&gt;I found that I needed two SAS Code nodes to separate out my out-of-time dataset.&amp;nbsp; The first removes the out-of-time data from the soon-to-be training and validation data.&amp;nbsp; The second creates the test data from only the out-of-time data.&amp;nbsp; From there I needed to link the test data into the flow after the non-test data was partitioned.&amp;nbsp; Attached is a picture of the flow.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Code in "Remove Test Data" node:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data &amp;amp;EM_EXPORT_TRAIN;
	set &amp;amp;EM_IMPORT_DATA;
	where Origin ne "Asia";
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Code in "Keep only Test Data" node:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data &amp;amp;EM_EXPORT_TEST;
	set &amp;amp;EM_IMPORT_DATA;
	where Origin = "Asia";
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The data in my diagram is from SASHELP.CARS.&lt;/P&gt;&lt;P&gt;The partitioning percentages for the Data Partition node are 70/30/0 (training/validation/test).&lt;/P&gt;&lt;P&gt;The Model Comparison node is there to show that the test data was successfully passed through.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please let me know if you have any questions!&amp;nbsp; Good luck!&lt;/P&gt;</description>
    <pubDate>Tue, 30 Jul 2019 22:22:22 GMT</pubDate>
    <dc:creator>Urban_Science</dc:creator>
    <dc:date>2019-07-30T22:22:22Z</dc:date>
    <item>
      <title>How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/577654#M7926</link>
      <description>&lt;P&gt;I am struggling with an odd issue with Miner.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The context: I am creating 3 sets of data within my project - a training, a validation and a test set. So far, so simple. The issue arises because I want the test set to be a purely out-of-time set (the few months after the development period), while the validation and training sets are just a 30/70 split of the remaining, in-time data.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I have done:&lt;/P&gt;&lt;P&gt;I have placed SAS Code nodes after the raw dataset, which split the data into an in-time and an out-of-time set. This way the in-time set can be partitioned as usual using the Data Partition node. In the code I specify that the out-of-time set is the test set (using the macro variable for the test export set). The problem is that Miner insists on passing through the original training set too.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is causing issues when I feed this node into the modelling steps after I split off the validation set, as there are now 2 training sets. I can't figure out how to make Miner drop one of them, or at least allow me to select one. The crude solution I have at the moment is to make sure that the correct training set is "on top" when the process is laid out, but this is not a permanent fix: once the model is handed off, just rearranging the physical position of the nodes will break the process.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Am I missing something obvious, or is there no way to prevent a SAS Code node from exporting a training set?&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jul 2019 10:38:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/577654#M7926</guid>
      <dc:creator>PrestickNinja</dc:creator>
      <dc:date>2019-07-30T10:38:13Z</dc:date>
    </item>
    <item>
      <title>Re: How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/577911#M7928</link>
      <description>&lt;P&gt;Hi PrestickNinja,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I created a simple diagram to see if I could replicate the issue you were experiencing.&amp;nbsp; In short, I could not replicate it, but the attempt gave me some ideas about what might be going wrong for you.&lt;/P&gt;&lt;P&gt;I found that I needed two SAS Code nodes to separate out my out-of-time dataset.&amp;nbsp; The first removes the out-of-time data from the soon-to-be training and validation data.&amp;nbsp; The second creates the test data from only the out-of-time data.&amp;nbsp; From there I needed to link the test data into the flow after the non-test data was partitioned.&amp;nbsp; Attached is a picture of the flow.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Code in "Remove Test Data" node:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data &amp;amp;EM_EXPORT_TRAIN;
	set &amp;amp;EM_IMPORT_DATA;
	where Origin ne "Asia";
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Code in "Keep only Test Data" node:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data &amp;amp;EM_EXPORT_TEST;
	set &amp;amp;EM_IMPORT_DATA;
	where Origin = "Asia";
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The data in my diagram is from SASHELP.CARS.&lt;/P&gt;&lt;P&gt;The partitioning percentages for the Data Partition node are 70/30/0 (training/validation/test).&lt;/P&gt;&lt;P&gt;The Model Comparison node is there to show that the test data was successfully passed through.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please let me know if you have any questions!&amp;nbsp; Good luck!&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jul 2019 22:22:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/577911#M7928</guid>
      <dc:creator>Urban_Science</dc:creator>
      <dc:date>2019-07-30T22:22:22Z</dc:date>
    </item>
    <item>
      <title>Re: How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578098#M7929</link>
      <description>&lt;P&gt;Hi Urban_Science,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the thorough attempt at assisting with this. My original diagram looks very similar to yours and I have essentially done the same thing as you (with some extra bits of mapping in the code).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The issue I am having is that your second node, "Keep only Test Data", should show that it is exporting both a training set (in this case the raw data) and the test set if you select Exported Data in its properties. I am trying to find a way to drop the training output from the "Keep only Test Data" SAS Code node. If your code isn't generating a training set then I have no idea what I am doing wrong - I am using the code node without modification apart from the code.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After messing around with deleting and re-adding the nodes a few times, and plenty of manual updating, I have found that Miner takes the training set of the first node connected, provided that node is updated before the second (test) data is connected. So I have figured out a way to ensure the correct training set gets used. While not ideal, it works, so that is what I am doing for now.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks again for taking the time to answer this.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2019 14:22:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578098#M7929</guid>
      <dc:creator>PrestickNinja</dc:creator>
      <dc:date>2019-07-31T14:22:33Z</dc:date>
    </item>
    <item>
      <title>Re: How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578131#M7930</link>
      <description>Good news, I have replicated the issue where "Keep only Test Data" passes training data too. I'll try some things over lunch to see what I can find.</description>
      <pubDate>Wed, 31 Jul 2019 16:00:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578131#M7930</guid>
      <dc:creator>Urban_Science</dc:creator>
      <dc:date>2019-07-31T16:00:00Z</dc:date>
    </item>
    <item>
      <title>Re: How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578133#M7931</link>
      <description>&lt;P&gt;I think I solved it.&amp;nbsp; Connect the nodes to a "Data Append" node.&amp;nbsp; Then in the properties of the "Data Append" node, click on "..." for the Data Selector property.&amp;nbsp; Then change Use to No for the TRAIN role data coming from the Code Node.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2019 16:09:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578133#M7931</guid>
      <dc:creator>Urban_Science</dc:creator>
      <dc:date>2019-07-31T16:09:28Z</dc:date>
    </item>
    <item>
      <title>Re: How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578282#M7933</link>
      <description>Ah, great. That sounds like it should fix the issue. I will give it a try as soon as I get in to work.</description>
      <pubDate>Thu, 01 Aug 2019 06:36:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578282#M7933</guid>
      <dc:creator>PrestickNinja</dc:creator>
      <dc:date>2019-08-01T06:36:41Z</dc:date>
    </item>
    <item>
      <title>Re: How do I drop data sources in Enterprise Miner, or choose one? (multiple training sets created)</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578289#M7934</link>
      <description>&lt;P&gt;That works perfectly - thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 08:03:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-do-I-drop-data-sources-in-Enterprise-Miner-or-choose-one/m-p/578289#M7934</guid>
      <dc:creator>PrestickNinja</dc:creator>
      <dc:date>2019-08-01T08:03:21Z</dc:date>
    </item>
  </channel>
</rss>

