<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic SAS Miner - Impute new scoring data? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-Miner-Impute-new-scoring-data/m-p/411099#M6265</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm relatively new to SAS Miner, so please forgive me if this is a really stupid question!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I created a model that uses 18 variables, 9 of which are imputed because some columns have a high proportion of nulls. Based on the scoring outputs of the test partition, the model looked fairly predictive (50% of the records with the outcome I was trying to predict fell in the top 10% of scores, and c.80% in the top 20%).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, when I came to score some brand new data, the top decile still performed OK (c.50% of the top decile had the outcome I was trying to predict vs. an overall 28%), but there were large swathes of records with identical model scores. As a result, some of the "middle" deciles are not performing as expected, because those records are smeared across the middle. This data has similar levels of nulls, and it is these nulls, together with the fact that I used imputed columns when building the model, that prompt my question.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When scoring new data, does Miner apply the imputation that was used in training, or do I need to pass the new data through an Impute node before scoring as well? If the former, could something else be causing my issue?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Below is my model diagram. The model is the flow on the right, from the data all the way to scoring the test partition. The flow on the left (starting with the node highlighted in yellow) is the new data I'm trying to score.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help would be very gratefully received. Thank you.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="model capture.JPG" style="width: 346px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/16486iEC8DF7F36ECB0C1A/image-size/large?v=v2&amp;amp;px=999" role="button" title="model capture.JPG" alt="model capture.JPG" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 07 Nov 2017 09:35:18 GMT</pubDate>
    <dc:creator>giant_wolf00</dc:creator>
    <dc:date>2017-11-07T09:35:18Z</dc:date>
    <item>
      <title>SAS Miner - Impute new scoring data?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-Miner-Impute-new-scoring-data/m-p/411099#M6265</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm relatively new to SAS Miner, so please forgive me if this is a really stupid question!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I created a model that uses 18 variables, 9 of which are imputed because some columns have a high proportion of nulls. Based on the scoring outputs of the test partition, the model looked fairly predictive (50% of the records with the outcome I was trying to predict fell in the top 10% of scores, and c.80% in the top 20%).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, when I came to score some brand new data, the top decile still performed OK (c.50% of the top decile had the outcome I was trying to predict vs. an overall 28%), but there were large swathes of records with identical model scores. As a result, some of the "middle" deciles are not performing as expected, because those records are smeared across the middle. This data has similar levels of nulls, and it is these nulls, together with the fact that I used imputed columns when building the model, that prompt my question.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When scoring new data, does Miner apply the imputation that was used in training, or do I need to pass the new data through an Impute node before scoring as well? If the former, could something else be causing my issue?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Below is my model diagram. The model is the flow on the right, from the data all the way to scoring the test partition. The flow on the left (starting with the node highlighted in yellow) is the new data I'm trying to score.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help would be very gratefully received. Thank you.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="model capture.JPG" style="width: 346px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/16486iEC8DF7F36ECB0C1A/image-size/large?v=v2&amp;amp;px=999" role="button" title="model capture.JPG" alt="model capture.JPG" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Nov 2017 09:35:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-Miner-Impute-new-scoring-data/m-p/411099#M6265</guid>
      <dc:creator>giant_wolf00</dc:creator>
      <dc:date>2017-11-07T09:35:18Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Miner - Impute new scoring data?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-Miner-Impute-new-scoring-data/m-p/411699#M6279</link>
      <description>&lt;P&gt;The Score node contains the imputation scoring code that was passed to it by the Impute node. When new data is passed to the Score node for scoring, the imputation code is applied automatically to the new data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
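&lt;P&gt;As a rough illustration of that idea (a minimal Python sketch, not the SAS code that the Score node actually generates): the replacement values are learned once from the training data, stored, and then reused on any new rows at scoring time. The function names, the IMP_ column prefix, the use of the mean, and the sample data are all assumptions made for this example.&lt;/P&gt;

```python
def fit_imputation(train_rows, columns):
    """Learn one replacement value per column (here: the training mean)."""
    values = {}
    for col in columns:
        present = [row[col] for row in train_rows if row[col] is not None]
        values[col] = sum(present) / len(present)
    return values


def score_with_imputation(rows, impute_values):
    """Apply the stored training values to new rows before scoring."""
    scored = []
    for row in rows:
        out = dict(row)
        for col, fill in impute_values.items():
            # Hypothetical naming: keep the raw column, add an imputed copy.
            out["IMP_" + col] = fill if row[col] is None else row[col]
        scored.append(out)
    return scored


# Hypothetical sample data: None stands in for a null.
train = [{"income": 10.0}, {"income": None}, {"income": 30.0}]
new = [{"income": None}, {"income": 50.0}]

stored = fit_imputation(train, ["income"])      # learned once, from training data
result = score_with_imputation(new, stored)     # reused unchanged on new data
```

&lt;P&gt;In this sketch the new rows are never used to compute the fill values. Note also that every row missing the same input receives the same filled value, which may be one reason for records ending up with identical scores.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;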
&lt;P&gt;You can see exactly what the Score node is going to do by viewing the score code itself.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;- After the Score node finishes running, right-click the Score node, and select Results.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;- In the Results window, select View -&amp;gt; Scoring -&amp;gt; SAS Code.&amp;nbsp; This is the code that is used to score new data.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Nov 2017 21:13:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-Miner-Impute-new-scoring-data/m-p/411699#M6279</guid>
      <dc:creator>MikeStockstill</dc:creator>
      <dc:date>2017-11-08T21:13:15Z</dc:date>
    </item>
  </channel>
</rss>

