<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Different Datasets for Training/Testing and Validation in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Different-Datasets-for-Training-Testing-and-Validation/m-p/275264#M4093</link>
    <description>Hi, &lt;BR /&gt;First, you don't need the two nodes shown in your post. Just drag in the validation data set and, in the panel on the left, change its role to Validate. Second, yes, you need a Score node, because your goal is to assess the model on that data. So: 1. Delete the Assign Role node. 2. Change the data set's role to Validate. 3. Connect both the validation data set AND the Decision Tree (DT) node to a Score node. Then connect the Score node to a Model Comparison node.&lt;BR /&gt;Jason Xin</description>
    <pubDate>Sun, 05 Jun 2016 20:23:01 GMT</pubDate>
    <dc:creator>JasonXin</dc:creator>
    <dc:date>2016-06-05T20:23:01Z</dc:date>
    <item>
      <title>Different Datasets for Training/Testing and Validation</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Different-Datasets-for-Training-Testing-and-Validation/m-p/274897#M4086</link>
      <description>&lt;P&gt;Hello Everybody,&lt;BR /&gt;&lt;BR /&gt;I'm trying to use two different datasets for a model, i.e. one for &lt;STRONG&gt;training/testing&lt;/STRONG&gt; and one for &lt;STRONG&gt;validation&lt;/STRONG&gt;. Please see the picture below:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/3475i326DD53E60A42CF2/image-size/original?v=v2&amp;amp;px=-1" alt="Test_Validate_problem.JPG" title="Test_Validate_problem.JPG" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;As you can see, I partitioned my Raw dataset (after assigning the variable roles target, input, etc.) into &lt;STRONG&gt;70% training and 30% testing&lt;/STRONG&gt;. I also have a second dataset called "Validation", to which I assigned the role "Validate".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For the model (here, a decision tree), I want Enterprise Miner (version 12.1) to build the model on the "training" partition and test it on the "test" partition. &lt;STRONG&gt;Afterwards, I want the generated model to be validated on the second dataset ("Validation").&lt;/STRONG&gt; There, however, I am left with only the target variable, an ID variable, and another variable to which I assigned the role "Rejected":&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/3474iB738E5F04A930D77/image-size/original?v=v2&amp;amp;px=-1" alt="varsummary_validation.JPG" title="varsummary_validation.JPG" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When I run this model I get the following error:&lt;BR /&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/3473iEE69859146A2522F/image-size/original?v=v2&amp;amp;px=-1" alt="error_message.JPG" title="error_message.JPG" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What am I doing wrong? Do I first have to add a "Score" node after the decision tree node?&lt;BR /&gt;&lt;BR /&gt;Any suggestions would be appreciated.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;BR /&gt;&lt;BR /&gt;Felix&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jun 2016 09:04:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Different-Datasets-for-Training-Testing-and-Validation/m-p/274897#M4086</guid>
      <dc:creator>FK</dc:creator>
      <dc:date>2016-06-03T09:04:44Z</dc:date>
    </item>
    <item>
      <title>Re: Different Datasets for Training/Testing and Validation</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Different-Datasets-for-Training-Testing-and-Validation/m-p/275264#M4093</link>
      <description>Hi, &lt;BR /&gt;First, you don't need the two nodes shown in your post. Just drag in the validation data set and, in the panel on the left, change its role to Validate. Second, yes, you need a Score node, because your goal is to assess the model on that data. So: 1. Delete the Assign Role node. 2. Change the data set's role to Validate. 3. Connect both the validation data set AND the Decision Tree (DT) node to a Score node. Then connect the Score node to a Model Comparison node.&lt;BR /&gt;Jason Xin</description>
      <pubDate>Sun, 05 Jun 2016 20:23:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Different-Datasets-for-Training-Testing-and-Validation/m-p/275264#M4093</guid>
      <dc:creator>JasonXin</dc:creator>
      <dc:date>2016-06-05T20:23:01Z</dc:date>
    </item>
  </channel>
</rss>

