<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Doubts about oversampling in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Doubts-about-oversampling/m-p/71401#M442</link>
    <description>I am trying to predict a rare event. I read about using oversampling with the sampling node both at the following link and in EM's Help. &lt;BR /&gt;
&lt;A href="http://support.sas.com/kb/24/205.html" target="_blank"&gt;http://support.sas.com/kb/24/205.html&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
The link says that I'm not supposed to adjust the frequency for oversampling, but EM's Help says I should. My intention is to build a model and then score a large database with it. Should I adjust the frequency for oversampling or not? &lt;BR /&gt;
&lt;BR /&gt;
I tried both approaches, and the cumulative lift and even some of the independent variables in the resulting models are very different.</description>
    <pubDate>Tue, 15 Sep 2009 12:05:50 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2009-09-15T12:05:50Z</dc:date>
    <item>
      <title>Doubts about oversampling</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Doubts-about-oversampling/m-p/71401#M442</link>
      <description>I am trying to predict a rare event. I read about using oversampling with the sampling node both at the following link and in EM's Help. &lt;BR /&gt;
&lt;A href="http://support.sas.com/kb/24/205.html" target="_blank"&gt;http://support.sas.com/kb/24/205.html&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
The link says that I'm not supposed to adjust the frequency for oversampling, but EM's Help says I should. My intention is to build a model and then score a large database with it. Should I adjust the frequency for oversampling or not? &lt;BR /&gt;
&lt;BR /&gt;
I tried both approaches, and the cumulative lift and even some of the independent variables in the resulting models are very different.</description>
      <pubDate>Tue, 15 Sep 2009 12:05:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Doubts-about-oversampling/m-p/71401#M442</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-09-15T12:05:50Z</dc:date>
    </item>
    <item>
      <title>Re: Doubts about oversampling</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Doubts-about-oversampling/m-p/71402#M443</link>
      <description>Hi, &lt;BR /&gt;
&lt;BR /&gt;
What Enterprise Miner version are you using? In Enterprise Miner 5.x, do not select the "adjust frequency for oversampling" check box, as it offsets the level-based sampling (oversampling). To my mind, you can use either the level-based sampling approach to oversampling or the adjust-frequency approach, but not both. I use diagrams like this one in EM 5.3:  &lt;BR /&gt;
&lt;BR /&gt;
Input Data Source (_without_ a target profile)&lt;BR /&gt;
&amp;gt;&lt;BR /&gt;
Sample Node (with level-based sampling, no frequency adjustment)&lt;BR /&gt;
&amp;gt;&lt;BR /&gt;
Decision node (create an appropriate target profile to reflect the true priors)&lt;BR /&gt;
&amp;gt; &lt;BR /&gt;
[...]&lt;BR /&gt;
&lt;BR /&gt;
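As a rough illustration of what "reflect the true priors" means numerically, here is a minimal Python sketch of the standard prior-correction formula for a binary target. It is not Enterprise Miner's internal code, and the function name and the example rates are hypothetical; it only shows how a probability estimated on oversampled data can be rescaled to the population event rate. &lt;BR /&gt;
&lt;BR /&gt;

```python
def adjust_posterior(p_sample, true_prior, sample_prior):
    """Rescale an event probability from an oversampled model to the true prior.

    p_sample     -- event probability predicted on the oversampled scale
    true_prior   -- event rate in the population (assumed known)
    sample_prior -- event rate in the oversampled training data
    """
    # Weight each class by the ratio of its true prior to its sample prior.
    event = p_sample * true_prior / sample_prior
    nonevent = (1.0 - p_sample) * (1.0 - true_prior) / (1.0 - sample_prior)
    # Renormalize so the two adjusted probabilities sum to one.
    return event / (event + nonevent)


# Hypothetical numbers: 2% true event rate, 50/50 oversampled training data.
adjusted = adjust_posterior(0.5, true_prior=0.02, sample_prior=0.5)
print(round(adjusted, 4))
```

&lt;BR /&gt;
With these assumed numbers, a score of 0.5 on the oversampled scale maps back to 0.02, the population event rate. The ranking of cases does not change, only the scale of the probabilities. &lt;BR /&gt;
&lt;BR /&gt;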
Cheers,&lt;BR /&gt;
Karsten</description>
      <pubDate>Thu, 17 Sep 2009 18:07:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Doubts-about-oversampling/m-p/71402#M443</guid>
      <dc:creator>Karsten_SAS</dc:creator>
      <dc:date>2009-09-17T18:07:11Z</dc:date>
    </item>
  </channel>
</rss>

