<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188412#M2324</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You got what you needed, good to go? How does your tree beat a default tree?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Thu, 04 Dec 2014 14:52:28 GMT</pubDate>
    <dc:creator>M_Maldonado</dc:creator>
    <dc:date>2014-12-04T14:52:28Z</dc:date>
    <item>
      <title>Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188404#M2316</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;What is the best and most efficient way to save a Tree Diagram in Enterprise Miner (EM) and apply it to 100% of the data for final results? I wish to keep my nodes as static as I can, as easily as possible.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am starting out using an 80/20 split. This might move closer to 60/40, but we will see. On Monday I think we will have our model finalized. Then I would like it applied to 100% of my original data.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It would also help if EM generated code that can be used directly in Enterprise Guide. Below is an example of some of the code generated by EM:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt; Node = 166&lt;/P&gt;&lt;P&gt;*------------------------------------------------------------*&lt;/P&gt;&lt;P&gt;if PURE_PREMIUM &amp;gt;= 5684.5 or MISSING&lt;/P&gt;&lt;P&gt;AND PAYROLL &amp;lt; 693492&lt;/P&gt;&lt;P&gt;AND HAZARD_CODE &amp;lt;= D&lt;/P&gt;&lt;P&gt;AND Business Unit IS ONE OF: 2, 3 or MISSING&lt;/P&gt;&lt;P&gt;AND BLEND_GROSS_LOAD2 &amp;gt;= 149 or MISSING&lt;/P&gt;&lt;P&gt;AND BLEND_GROSS_LOAD1 &amp;lt; 40.5 or MISSING&lt;/P&gt;&lt;P&gt;then &lt;/P&gt;&lt;P&gt; Tree Node Identifier&amp;nbsp;&amp;nbsp; = 166&lt;/P&gt;&lt;P&gt; Number of Observations = 236&lt;/P&gt;&lt;P&gt; Predicted: D_GROSS_LOADED_WITH_TREND=1 = 0.54&lt;/P&gt;&lt;P&gt; Predicted: D_GROSS_LOADED_WITH_TREND=0 = 0.46&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;That does not help me much. I would prefer something that can be used within standard SAS code. For example: if ((PURE_PREMIUM &amp;lt;= 5684.5) or (PURE_PREMIUM =&amp;nbsp; .)) then ...;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Perhaps I am missing the option to create this code within EM. Thank you.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 26 Nov 2014 21:56:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188404#M2316</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2014-11-26T21:56:33Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188405#M2317</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;EM does generate the full score code that can be used in EG.&lt;/P&gt;&lt;P&gt;I believe there's a score code node that generates the code. What version of EM are you on?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 27 Nov 2014 02:42:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188405#M2317</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2014-11-27T02:42:37Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188406#M2318</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I am on 13.1, and yes I found the score code node. Thank you very much.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Do you recommend I use the Optimized SAS Code or just the regular SAS Code? What are the differences?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, there are a lot of code statements in there that I am not familiar with. Do you suggest I just have my code point to my data file/library then just run all of this code that was generated? Or do you recommend anything else? Below are some of the initial statements in my code - not making a lot of sense to me right now:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;_ARBFMT_12 = PUT( BU , BEST12.);&lt;/P&gt;&lt;P&gt; %DMNORMIP( _ARBFMT_12);&lt;/P&gt;&lt;P&gt;IF _ARBFMT_12 IN ('1' ) THEN DO;&lt;/P&gt;&lt;P&gt;&amp;nbsp; IF&amp;nbsp; NOT MISSING(PURE_PREMIUM ) AND&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 5112.5 &amp;lt;= PURE_PREMIUM&amp;nbsp; THEN DO;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _ARBFMT_12 = PUT( SMNQ_D_POST_CODE , BEST12.);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; %DMNORMIP( _ARBFMT_12);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; IF _ARBFMT_12 IN ('3' ) THEN DO;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _NODE_&amp;nbsp; =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 81;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _LEAF_&amp;nbsp; =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 21;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;May I ask for an example of 
how the Score Code node is incorporated with EG? I am a little nervous because I plan on applying this code to a new dataset with the same variables. I am much more accustomed to simple code.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you again.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 28 Nov 2014 19:13:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188406#M2318</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2014-11-28T19:13:24Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188407#M2319</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It shouldn't matter which version of the code you use.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Some of the statements at the top are transformations that may have been applied in various steps of the analysis.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To use this in EG, create a program as follows; that should do what you want.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Data Score;&lt;/P&gt;&lt;P&gt;set &amp;lt;your data&amp;gt;;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;lt;insert code from Enterprise Miner&amp;gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 28 Nov 2014 21:22:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188407#M2319</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2014-11-28T21:22:25Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188408#M2320</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you again for all of the valuable advice.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But I have a huge problem. My diagram is rather simple. I have a data node, a data partition node, then a decision tree node. One of my colleagues and I were "interactively" changing the results of the decision tree node to be more in alignment with what our results should substantively say. We did it using the interactive feature of the decision tree node, then we closed it. We changed a few more things and then re-ran the decision tree node to sort of start over again. But it also re-ran the data partition as well, and we cannot figure out why.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Before I fully implement the code generator I would like to somehow &lt;SPAN style="text-decoration: underline;"&gt;lock down&lt;/SPAN&gt; the other nodes. Is it possible in EM to &lt;SPAN style="text-decoration: underline;"&gt;ensure&lt;/SPAN&gt; that nothing will change?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Dec 2014 15:57:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188408#M2320</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2014-12-03T15:57:28Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188409#M2321</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Enterprise Miner nodes re-run only the parts that need to be re-run. For example, if you change a property under the Report section, only the code that is involved in reporting will re-run.&lt;/P&gt;&lt;P&gt;In your example, if you don't change any properties on the Data Partition node, the green wheel might make it look like it's running, but it is just checking that some tables or results exist. It is not re-running the whole thing.&lt;/P&gt;&lt;P&gt;I would need to double check, but I think that if you are using Interactive mode to grow your tree, you have to save it, close it, and not change any property. If you are going to re-run things, I would suggest setting the Use Frozen Tree property to Yes.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As a bonus, challenge your interactive tree with some other trees. I would use the following and compare their subtree assessment plots and their fit statistics with a Model Comparison node:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Largest tree, just to confirm that the largest tree is an overtrained model.&lt;/LI&gt;&lt;LI&gt;Default tree (maxdepth 6)&lt;/LI&gt;&lt;LI&gt;Tree with maximum depth 10&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Good luck,&lt;/P&gt;&lt;P&gt;Miguel&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Dec 2014 16:25:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188409#M2321</guid>
      <dc:creator>M_Maldonado</dc:creator>
      <dc:date>2014-12-03T16:25:21Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188410#M2322</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Zachary, &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also check that the Rerun property is set to No for your datasource node. If it is set to Yes, that would explain why the Partition node is running each time you execute the flow. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Ray&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 03 Dec 2014 21:12:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188410#M2322</guid>
      <dc:creator>rayIII</dc:creator>
      <dc:date>2014-12-03T21:12:38Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188411#M2323</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It was already set to No by default. Thank you for the suggestion there.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 04 Dec 2014 13:40:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188411#M2323</guid>
      <dc:creator>Zachary</dc:creator>
      <dc:date>2014-12-04T13:40:38Z</dc:date>
    </item>
    <item>
      <title>Re: Best Way to Finalize a Model Using 100% Data After 80/20 Training/Validation Split?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188412#M2324</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You got what you needed, good to go? How does your tree beat a default tree?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 04 Dec 2014 14:52:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Best-Way-to-Finalize-a-Model-Using-100-Data-After-80-20-Training/m-p/188412#M2324</guid>
      <dc:creator>M_Maldonado</dc:creator>
      <dc:date>2014-12-04T14:52:28Z</dc:date>
    </item>
  </channel>
</rss>

