<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Adjust the distribution of a feature with sampling? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196475#M2561</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Jon,&lt;/P&gt;&lt;P&gt;A way to do it directly in EM:&lt;/P&gt;&lt;P&gt;On your Data Partition node, click on the Variables ellipsis (...). On the menu, you can specify a Partition Role as Stratification.&lt;/P&gt;&lt;P&gt;e.g.&lt;/P&gt;&lt;P&gt;Home Equity IDS-&amp;gt;Partition (change Partition Role of 'Reason' from Default to Stratification)&lt;/P&gt;&lt;P&gt;&lt;IMG __jive_id="9494" alt="forsascomm4.png" class="jive-image" src="https://communities.sas.com/legacyfs/online/9494_forsascomm4.png" /&gt;&lt;/P&gt;&lt;P&gt;I hope this helps,&lt;/P&gt;&lt;P&gt;Miguel&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Sat, 07 Mar 2015 15:47:34 GMT</pubDate>
    <dc:creator>M_Maldonado</dc:creator>
    <dc:date>2015-03-07T15:47:34Z</dc:date>
    <item>
      <title>Adjust the distribution of a feature with sampling?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196472#M2558</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I intend to build a predictive model. I have a large data set with 10 features and 1 interval target. One of those features is numeric, ranging from 1 to 900. I know that, due to some underlying changes in the population, records with values from about 1 to 250 are underrepresented in my sample, and those at 251 and above are overrepresented. I know approximately what the distribution of this feature should look like. Is there an easy way to sample from the data set with replacement so that the distribution of this feature matches percentages I specify?&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 06 Mar 2015 20:13:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196472#M2558</guid>
      <dc:creator>JonB_</dc:creator>
      <dc:date>2015-03-06T20:13:06Z</dc:date>
    </item>
    <item>
      <title>Re: Adjust the distribution of a feature with sampling?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196473#M2559</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;One way would be to add a strata variable based on whether the value is over or under the given break point. I don't know whether EM has a direct sampling tool, but PROC SURVEYSELECT allows setting a sampling rate per stratum.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 06 Mar 2015 23:17:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196473#M2559</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2015-03-06T23:17:26Z</dc:date>
    </item>
    <item>
      <title>Re: Adjust the distribution of a feature with sampling?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196474#M2560</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I ended up breaking my data into several segments depending on the value of my numeric feature, and used PROC SURVEYSELECT to sample with replacement from the individual pieces until the overall distribution of my data looked as I expected it to. Thanks!&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 07 Mar 2015 15:16:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196474#M2560</guid>
      <dc:creator>JonB_</dc:creator>
      <dc:date>2015-03-07T15:16:26Z</dc:date>
    </item>
    <item>
      <title>Re: Adjust the distribution of a feature with sampling?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196475#M2561</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Jon,&lt;/P&gt;&lt;P&gt;A way to do it directly in EM:&lt;/P&gt;&lt;P&gt;On your Data Partition node, click on the Variables ellipsis (...). On the menu, you can specify a Partition Role as Stratification.&lt;/P&gt;&lt;P&gt;e.g.&lt;/P&gt;&lt;P&gt;Home Equity IDS-&amp;gt;Partition (change Partition Role of 'Reason' from Default to Stratification)&lt;/P&gt;&lt;P&gt;&lt;IMG __jive_id="9494" alt="forsascomm4.png" class="jive-image" src="https://communities.sas.com/legacyfs/online/9494_forsascomm4.png" /&gt;&lt;/P&gt;&lt;P&gt;I hope this helps,&lt;/P&gt;&lt;P&gt;Miguel&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 07 Mar 2015 15:47:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Adjust-the-distribution-of-a-feature-with-sampling/m-p/196475#M2561</guid>
      <dc:creator>M_Maldonado</dc:creator>
      <dc:date>2015-03-07T15:47:34Z</dc:date>
    </item>
  </channel>
</rss>

