<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Creating a 'less specific' model for a rare event in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Creating-a-less-specific-model-for-a-rare-event/m-p/511550#M7485</link>
    <description>&lt;P&gt;Misclassification tables can be very misleading in rare-event scenarios. Those tables are typically built using either the default target profile (the most likely outcome is the prediction) or a weighted outcome based on decision weights that you have entered (the most valuable outcome is the one predicted). In practice, you should choose a threshold for your decision after looking at how the model performs, taking into consideration the different types of error you might make (e.g., is it more problematic to predict an 'event' as a 'non-event' or vice versa?). Your choice of the 'best' cutoff can change depending on your goal and the risk/reward associated with each outcome. There is a good thread discussing some of your options at:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Data-Mining-and-Machine/A-Question-on-Modeling-Rare-Events-Data/m-p/374048#M5561" target="_self"&gt;https://communities.sas.com/t5/SAS-Data-Mining-and-Machine/A-Question-on-Modeling-Rare-Events-Data/m-p/374048#M5561&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps!&lt;/P&gt;
&lt;P&gt;Doug&lt;/P&gt;</description>
    <pubDate>Thu, 08 Nov 2018 21:50:40 GMT</pubDate>
    <dc:creator>DougWielenga</dc:creator>
    <dc:date>2018-11-08T21:50:40Z</dc:date>
    <item>
      <title>Creating a 'less specific' model for a rare event</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Creating-a-less-specific-model-for-a-rare-event/m-p/508619#M7462</link>
      <description>&lt;P&gt;Morning all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is my first foray into the world of predictive modelling.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm attempting to predict an event that occurs in roughly 1% of my dataset: 1,400 events in a dataset of 118,000. The event is 'L', meaning 'Large customers', in a known data set of small/medium/large customers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've set this as a binary target and set ordering to ascending so that it attempts to predict the rare event.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Firstly, is this the correct way to approach things, or should I be manipulating the prior probabilities and oversampling instead?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. The difficulty I'm having is that the models are trying to be 'too predictive'. For example, am I right in thinking the attached matrix suggests I would 'lose' 206 large customers for every 34 I can correctly predict?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I were providing my sales guys with a list of clients, my preference would be to improve on the chance rate of 1 in 100 being large to something like 7 in 100, whilst 'losing' the smallest number of potential leads.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I hope this makes sense. If you need any more clarity, please let me know.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using Enterprise Miner 14.1 and I've mainly been looking at Decision Trees and Regression models.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;F&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Classification matrix.png" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/24480iC437200A7F5A3DA9/image-size/large?v=v2&amp;amp;px=999" role="button" title="Classification matrix.png" alt="Classification matrix.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 30 Oct 2018 09:50:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Creating-a-less-specific-model-for-a-rare-event/m-p/508619#M7462</guid>
      <dc:creator>F_Clay</dc:creator>
      <dc:date>2018-10-30T09:50:32Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a 'less specific' model for a rare event</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Creating-a-less-specific-model-for-a-rare-event/m-p/511550#M7485</link>
      <description>&lt;P&gt;Misclassification tables can be very misleading in rare-event scenarios. Those tables are typically built using either the default target profile (the most likely outcome is the prediction) or a weighted outcome based on decision weights that you have entered (the most valuable outcome is the one predicted). In practice, you should choose a threshold for your decision after looking at how the model performs, taking into consideration the different types of error you might make (e.g., is it more problematic to predict an 'event' as a 'non-event' or vice versa?). Your choice of the 'best' cutoff can change depending on your goal and the risk/reward associated with each outcome. There is a good thread discussing some of your options at:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Data-Mining-and-Machine/A-Question-on-Modeling-Rare-Events-Data/m-p/374048#M5561" target="_self"&gt;https://communities.sas.com/t5/SAS-Data-Mining-and-Machine/A-Question-on-Modeling-Rare-Events-Data/m-p/374048#M5561&lt;/A&gt;&lt;/P&gt;
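&lt;P&gt;To make the cutoff idea concrete, here is a minimal sketch in plain Python (not Enterprise Miner code). It assumes you have exported each case's predicted event probability and its actual 0/1 outcome. The function names, the toy scores, and the toy target of 0.4 are all made up for illustration; in the situation described above the target would be closer to 0.07 (7 in 100). It scans candidate cutoffs from lowest to highest and keeps the lowest one whose hit rate (precision) meets the target, which by construction gives up the fewest true events.&lt;/P&gt;

```python
from typing import List, Optional, Tuple


def precision_recall_at(
    scores: List[float], labels: List[int], cutoff: float
) -> Tuple[float, float]:
    """Return (precision, recall) when every score >= cutoff is flagged as an event."""
    flagged = [y for s, y in zip(scores, labels) if s >= cutoff]
    true_pos = sum(flagged)
    total_events = sum(labels)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_events if total_events else 0.0
    return precision, recall


def lowest_cutoff_meeting(
    scores: List[float], labels: List[int], target_precision: float
) -> Optional[float]:
    """Return the lowest cutoff whose precision reaches the target, or None.

    Lowering the cutoff flags more cases, so recall never decreases as the
    cutoff drops. The lowest qualifying cutoff therefore loses the fewest events.
    """
    for cutoff in sorted(set(scores)):
        precision, _ = precision_recall_at(scores, labels, cutoff)
        if precision >= target_precision:
            return cutoff
    return None


# Hypothetical toy data: predicted probabilities and actual outcomes (1 = event).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

cutoff = lowest_cutoff_meeting(scores, labels, target_precision=0.4)
if cutoff is not None:
    precision, recall = precision_recall_at(scores, labels, cutoff)
    print(f"cutoff={cutoff}, precision={precision:.3f}, recall={recall:.3f}")
```

&lt;P&gt;On this toy data a 0.5 cutoff also reaches the target hit rate but misses one of the three events, while the lower cutoff found by the scan keeps all three. The same comparison on your own scored data would show how many leads each cutoff gives up.&lt;/P&gt;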
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps!&lt;/P&gt;
&lt;P&gt;Doug&lt;/P&gt;</description>
      <pubDate>Thu, 08 Nov 2018 21:50:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Creating-a-less-specific-model-for-a-rare-event/m-p/511550#M7485</guid>
      <dc:creator>DougWielenga</dc:creator>
      <dc:date>2018-11-08T21:50:40Z</dc:date>
    </item>
  </channel>
</rss>

