<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using cross-validation in Enterprise Miner; in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233639#M3315</link>
    <description>&lt;P&gt;Thank you, Miguel!&lt;/P&gt;</description>
    <pubDate>Sat, 07 Nov 2015 16:27:15 GMT</pubDate>
    <dc:creator>frak</dc:creator>
    <dc:date>2015-11-07T16:27:15Z</dc:date>
    <item>
      <title>Using cross-validation in Enterprise Miner;</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233635#M3313</link>
      <description>&lt;P&gt;Hi, this is my problem:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a dataset to which I need to apply data mining techniques.&lt;/P&gt;&lt;P&gt;Because my&lt;SPAN&gt;&amp;nbsp;dataset is small (about 700 obs.),&lt;/SPAN&gt;&amp;nbsp;I thought I would use 10-fold cross-validation rather than external validation.&lt;/P&gt;&lt;P&gt;However, I don't know how to do this in Enterprise Miner:&lt;/P&gt;&lt;P&gt;when I specify a model, I can't find a way to tell Enterprise Miner to use cross-validation.&lt;/P&gt;&lt;P&gt;For example, tree models have an explicit option "execute crossvalidation yes/no", but I can't find it for other model types.&lt;/P&gt;&lt;P&gt;Can someone help me use cross-validation with regression and MBR models?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I hope what I've written is understandable.&lt;/P&gt;&lt;P&gt;Thank you&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 07 Nov 2015 16:02:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233635#M3313</guid>
      <dc:creator>frak</dc:creator>
      <dc:date>2015-11-07T16:02:23Z</dc:date>
    </item>
    <item>
      <title>Re: Using cross-validation in Enterprise Miner;</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233638#M3314</link>
      <description>&lt;P&gt;Hi Frak,&lt;/P&gt;
&lt;P&gt;You are right, the three&amp;nbsp;nodes that support crossvalidation directly are:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;SPAN style="line-height: 20px;"&gt;Decision tree&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN style="line-height: 20px;"&gt;LARS regression&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN style="line-height: 20px;"&gt;HPTree&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;For the rest of the model nodes (including some High-Performance model nodes), you can use the Start and End group nodes. Any node that produces SAS score code can be used this way. This means most of the nodes (HP Regression, HP Tree, HPNeural, etc.) will work well with this approach, but not HPForest or some settings of HPSVM.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Check out:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The Power of the Group Processing Facility in SAS® Enterprise Miner™, Sascha Schubert, SAS Institute Inc., Cary, NC&lt;/P&gt;
&lt;P&gt;&lt;A href="https://support.sas.com/resources/papers/proceedings10/123-2010.pdf" target="_blank"&gt;https://support.sas.com/resources/papers/proceedings10/123-2010.pdf&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="line-height: 20px;"&gt;This paper walks you through all the cool things you can do with the Start and End group nodes. Bottom line, you need to use a Transform node to create a crossvalidation variable. Include that, and you are pretty much set up!&lt;/SPAN&gt;&lt;/P&gt;
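&lt;P&gt;As a rough illustration outside Enterprise Miner, a crossvalidation variable is just a fold label attached to each row. The Python sketch below is not what the Transform node runs; the function name assign_folds, the seed, and the 700-row count (borrowed from the question) are assumptions made for the example.&lt;/P&gt;
&lt;PRE&gt;
```python
import random


def assign_folds(n_rows, k=10, seed=42):
    """Return one fold label (1..k) per row, in shuffled order."""
    # Cycling through 1..k keeps fold sizes within one row of each other.
    labels = [(i % k) + 1 for i in range(n_rows)]
    # A seeded shuffle makes the assignment random but reproducible.
    random.Random(seed).shuffle(labels)
    return labels


# Hypothetical usage: 700 rows, 10 folds.
folds = assign_folds(700, k=10)
```
&lt;/PRE&gt;
&lt;P&gt;Each row ends up with a label from 1 to k, so a grouping step can hold out one label at a time.&lt;/P&gt;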
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck!&lt;/P&gt;
&lt;P&gt;Miguel&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;[Edit: I added HPTree to the list. You can do crossvalidation cost-complexity when you do not have a Partition node]&lt;/P&gt;</description>
      <pubDate>Sat, 07 Nov 2015 16:56:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233638#M3314</guid>
      <dc:creator>M_Maldonado</dc:creator>
      <dc:date>2015-11-07T16:56:50Z</dc:date>
    </item>
    <item>
      <title>Re: Using cross-validation in Enterprise Miner;</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233639#M3315</link>
      <description>&lt;P&gt;Thank you, Miguel!&lt;/P&gt;</description>
      <pubDate>Sat, 07 Nov 2015 16:27:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233639#M3315</guid>
      <dc:creator>frak</dc:creator>
      <dc:date>2015-11-07T16:27:15Z</dc:date>
    </item>
    <item>
      <title>Re: Using cross-validation in Enterprise Miner;</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233875#M3316</link>
      <description>&lt;P&gt;Daaa Daan, I'm back again, and I have new doubts...&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As Miguel recommended (thank you again, Miguel!), I used the Start and End group nodes to set up cross-validation.&lt;/P&gt;
&lt;P&gt;So, as expected, using 10-fold crossvalidation, I obtained 11 different datasets (10 with 9/10 of the data and 1 complete), and EM fitted a model for each of&amp;nbsp;them. That's OK.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BUT, in my studies I learned (maybe) that k-fold cross-validation also ends with a validation dataset: the combined scores of each model (trained on (k-1)/k of the data) on the remaining 1/k of the data. This doesn't happen in EM, so I don't have a validation dataset.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is a big problem for me because, as I understand it, I need validation data to check for overfitting in my model, don't I? How can I tell whether there is overfitting by looking only at the training datasets?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could someone tell me what is wrong?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Again, sorry for my terrible English, and thank you in advance!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2015 17:37:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233875#M3316</guid>
      <dc:creator>frak</dc:creator>
      <dc:date>2015-11-09T17:37:52Z</dc:date>
    </item>
    <item>
      <title>Re: Using cross-validation in Enterprise Miner;</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233877#M3317</link>
      <description>&lt;P&gt;Hey bud,&lt;/P&gt;
&lt;P&gt;Glad you are back. I'm the same, I always use a test set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;At the beginning of your flow, do: Data -&amp;gt; Data Partition (70% training, 0% validation, 30% test) -&amp;gt; then your crossvalidation flow -&amp;gt; then a Model Comparison node [even if you only have one model, you are using it to see more fit statistics about your model].&lt;/P&gt;
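&lt;P&gt;The same flow can be sketched in plain Python, purely as an illustration and not as Enterprise Miner output. The helper names fit_mean_model and holdout_then_cv, the simulated target values, and the mean-only stand-in model are all assumptions; only the 70/30 split and the 10 folds come from this thread.&lt;/P&gt;
&lt;PRE&gt;
```python
import random
from statistics import mean


def fit_mean_model(train_y):
    """Stand-in 'model' (an assumption): always predicts the training mean."""
    m = mean(train_y)
    return lambda: m


def holdout_then_cv(y, k=10, test_share=0.3, seed=1):
    """Set aside a test part, then run k-fold cross-validation on the rest."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(y) * test_share))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    # Each training row is scored exactly once by a model that never saw it.
    # Pooling these out-of-fold scores gives a validation-like dataset.
    oof = {}
    for fold in range(k):
        held_out = train_idx[fold::k]
        held_out_set = set(held_out)
        fit_rows = [i for i in train_idx if i not in held_out_set]
        predict = fit_mean_model([y[i] for i in fit_rows])
        for i in held_out:
            oof[i] = predict()
    cv_mse = mean((y[i] - oof[i]) ** 2 for i in train_idx)

    # Final model on all training rows, checked once on the untouched test rows.
    predict = fit_mean_model([y[i] for i in train_idx])
    test_mse = mean((y[i] - predict()) ** 2 for i in test_idx)
    return cv_mse, test_mse, len(oof), len(test_idx)


# Simulated target values, only so the sketch runs end to end.
rng = random.Random(0)
y = [rng.gauss(10, 2) for _ in range(700)]
cv_mse, test_mse, n_oof, n_test = holdout_then_cv(y)
```
&lt;/PRE&gt;
&lt;P&gt;With 700 rows this leaves 490 rows for cross-validation and 210 for the test set. A test error much larger than the cross-validated error would be one sign of overfitting.&lt;/P&gt;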
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Keep up the good work!&lt;/P&gt;
&lt;P&gt;-Miguel&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2015 17:49:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233877#M3317</guid>
      <dc:creator>M_Maldonado</dc:creator>
      <dc:date>2015-11-09T17:49:16Z</dc:date>
    </item>
    <item>
      <title>Re: Using cross-validation in Enterprise Miner;</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233881#M3318</link>
      <description>&lt;P&gt;Oh, it seems you were waiting for me, haha... so fast &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;Yes, I could use a Data Partition node, but I would lose the benefits of crossvalidation: I would prefer not to split&amp;nbsp;my dataset...&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In any case, thank you a lot &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2015 18:19:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-cross-validation-in-Enterprise-Miner/m-p/233881#M3318</guid>
      <dc:creator>frak</dc:creator>
      <dc:date>2015-11-09T18:19:12Z</dc:date>
    </item>
  </channel>
</rss>

