<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SAS EMiner Oversampling reduced the target sample size in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/316081#M4751</link>
    <description>Hi,

First of all, there is no oversampling node in EM; I assume you mean the Sample node. The Sample node offers random, systematic, first N, stratified, and so on. None of them lets you change the ratio between 1 and 0 on the target. The purpose of sampling is to take a subset, in one way or another, that represents the master source. The goal is to represent, not to alter. Oversampling, on the other hand, recomposes the sample and therefore alters it by design. The Sample node is typically used in situations like this: the qualified modeling universe has 20 million observations, and I need to take a 5% sample to make it workable in EM. In this sense, sampling is not really an analytical or technical step, whereas oversampling is analytical through and through. In other words, the reason you run sampling should not overlap with the reason you oversample, although the act of oversampling is, per se, sampling. Hope this helps. Thank you for using SAS. Best, Jason Xin</description>
    <pubDate>Thu, 01 Dec 2016 21:12:52 GMT</pubDate>
    <dc:creator>JasonXin</dc:creator>
    <dc:date>2016-12-01T21:12:52Z</dc:date>
    <item>
      <title>SAS EMiner Oversampling reduced the target sample size</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/312410#M4691</link>
      <description>&lt;P&gt;I used EMiner to do oversampling. The original target variable is binary, with the proportions below:&lt;/P&gt;&lt;TABLE&gt;&lt;TR&gt;&lt;TH&gt;Variable&lt;/TH&gt;&lt;TH&gt;Value&lt;/TH&gt;&lt;TH&gt;Count&lt;/TH&gt;&lt;TH&gt;Percent&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Target&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;252&lt;/TD&gt;&lt;TD&gt;32.7273&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Target&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;518&lt;/TD&gt;&lt;TD&gt;67.2727&lt;/TD&gt;&lt;/TR&gt;&lt;/TABLE&gt;&lt;P&gt;After oversampling, I got:&lt;/P&gt;&lt;P&gt;Data=SAMPLE&lt;/P&gt;&lt;TABLE&gt;&lt;TR&gt;&lt;TH&gt;Variable&lt;/TH&gt;&lt;TH&gt;Value&lt;/TH&gt;&lt;TH&gt;Count&lt;/TH&gt;&lt;TH&gt;Percent&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Target&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;252&lt;/TD&gt;&lt;TD&gt;50&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Target&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;252&lt;/TD&gt;&lt;TD&gt;50&lt;/TD&gt;&lt;/TR&gt;&lt;/TABLE&gt;&lt;P&gt;The sample size for Target='1' was reduced from 518 to 252. This is not the result I want.&lt;/P&gt;&lt;P&gt;I want to increase the Target='0' sample size from 252 to 518.&lt;/P&gt;&lt;P&gt;Does anyone know how to solve this problem?&lt;/P&gt;&lt;P&gt;Any suggestion is appreciated!&lt;/P&gt;</description>
      <pubDate>Thu, 17 Nov 2016 18:06:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/312410#M4691</guid>
      <dc:creator>beibeiwhy</dc:creator>
      <dc:date>2016-11-17T18:06:11Z</dc:date>
    </item>
    <item>
      <title>Re: SAS EMiner Oversampling reduced the target sample size</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/312715#M4701</link>
      <description>I'm new to EMiner. Really need help with this.&lt;BR /&gt;Thanks a lot</description>
      <pubDate>Fri, 18 Nov 2016 20:01:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/312715#M4701</guid>
      <dc:creator>beibeiwhy</dc:creator>
      <dc:date>2016-11-18T20:01:31Z</dc:date>
    </item>
    <item>
      <title>Re: SAS EMiner Oversampling reduced the target sample size</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/312818#M4704</link>
      <description>Hi, 

In EM, see the attached picture. Once you load the data into EM, the YES group (in the picture) corresponds to your 1 group and the NO group to your 0 group. Count=999 corresponds to your 518 and Count=967 to your 252. On the right, in place of 0.5081, enter 1. In place of 0.4919, enter 2.055555556 (=518/252). In plain English, you are telling EM to treat the 518-observation 1 group as it is, and to treat the 252-observation 0 group as if it had 2.055555556*252, or approximately 518, observations.
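
If you want to check the arithmetic outside EM, here is a minimal Python sketch. It is not EM code and does not reproduce what EM does internally; it only recomputes the two values to type into the dialog from the counts in the original post. The variable names are illustrative assumptions.

```python
# Class counts taken from the original post (not read from EM).
count_1 = 518  # observations with target = 1
count_0 = 252  # observations with target = 0

# Leave the larger class as it is; scale the smaller class up to match it.
weight_1 = 1.0
weight_0 = count_1 / count_0

# Effective size of the target = 0 group once the weight is applied.
effective_0 = weight_0 * count_0

print(f"value for target=1: {weight_1}")
print(f"value for target=0: {weight_0:.9f}")
print(f"effective target=0 count: {effective_0:.0f}")
```

The second value printed is the 2.055555556 mentioned above, and the effective count comes out at 518, so both groups carry equal weight without dropping any of the 518 records.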

Hope this helps? 
Jason Xin

&lt;BR /&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/13159i49719ABD5C08B0FD/image-size/large?v=1.0&amp;amp;px=600" border="0" alt="priorsem.jpg" title="priorsem.jpg" /&gt;</description>
      <pubDate>Sat, 19 Nov 2016 16:58:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/312818#M4704</guid>
      <dc:creator>JasonXin</dc:creator>
      <dc:date>2016-11-19T16:58:28Z</dc:date>
    </item>
    <item>
      <title>Re: SAS EMiner Oversampling reduced the target sample size</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/314868#M4737</link>
      <description>Hi Jason, your reply is very helpful. So using this prior decision, I don't need to use the oversampling node any more?</description>
      <pubDate>Mon, 28 Nov 2016 16:12:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/314868#M4737</guid>
      <dc:creator>beibeiwhy</dc:creator>
      <dc:date>2016-11-28T16:12:58Z</dc:date>
    </item>
    <item>
      <title>Re: SAS EMiner Oversampling reduced the target sample size</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/316081#M4751</link>
      <description>Hi,

First of all, there is no oversampling node in EM; I assume you mean the Sample node. The Sample node offers random, systematic, first N, stratified, and so on. None of them lets you change the ratio between 1 and 0 on the target. The purpose of sampling is to take a subset, in one way or another, that represents the master source. The goal is to represent, not to alter. Oversampling, on the other hand, recomposes the sample and therefore alters it by design. The Sample node is typically used in situations like this: the qualified modeling universe has 20 million observations, and I need to take a 5% sample to make it workable in EM. In this sense, sampling is not really an analytical or technical step, whereas oversampling is analytical through and through. In other words, the reason you run sampling should not overlap with the reason you oversample, although the act of oversampling is, per se, sampling. Hope this helps. Thank you for using SAS. Best, Jason Xin</description>
      <pubDate>Thu, 01 Dec 2016 21:12:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SAS-EMiner-Oversampling-reduced-the-traget-sample-size/m-p/316081#M4751</guid>
      <dc:creator>JasonXin</dc:creator>
      <dc:date>2016-12-01T21:12:52Z</dc:date>
    </item>
  </channel>
</rss>

