<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PROC SMOTESAMPLE in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/PROC-SMOTESAMPLE/m-p/972980#M377620</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I'm currently using the smotesample action in SAS and encountering two unusual issues:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P&gt;&lt;STRONG&gt;Unexpected Categorical Values&lt;/STRONG&gt;:&lt;BR /&gt;Some of the newly generated values for categorical variables are large negative numbers (e.g., -1000000) that do not exist in the original dataset. Could you clarify why this might happen? Specifically, how does SMOTE handle categorical features, and is there a recommended approach to ensure realistic synthetic values?&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;&lt;STRONG&gt;Amplification of Placeholder Values&lt;/STRONG&gt;:&lt;BR /&gt;My dataset uses placeholder values like -1 or -2 to represent missing data. After applying SMOTE, these placeholders seem to become more dominant in the sampled data. Is there a way to prevent SMOTE from oversampling these values or to exclude them from the interpolation process?&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Any guidance or best practices would be greatly appreciated.&lt;/P&gt;</description>
    <pubDate>Wed, 20 Aug 2025 14:16:08 GMT</pubDate>
    <dc:creator>phopkinson</dc:creator>
    <dc:date>2025-08-20T14:16:08Z</dc:date>
    <item>
      <title>PROC SMOTESAMPLE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SMOTESAMPLE/m-p/972980#M377620</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I'm currently using the smotesample action in SAS and encountering two unusual issues:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P&gt;&lt;STRONG&gt;Unexpected Categorical Values&lt;/STRONG&gt;:&lt;BR /&gt;Some of the newly generated values for categorical variables are large negative numbers (e.g., -1000000) that do not exist in the original dataset. Could you clarify why this might happen? Specifically, how does SMOTE handle categorical features, and is there a recommended approach to ensure realistic synthetic values?&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;&lt;STRONG&gt;Amplification of Placeholder Values&lt;/STRONG&gt;:&lt;BR /&gt;My dataset uses placeholder values like -1 or -2 to represent missing data. After applying SMOTE, these placeholders seem to become more dominant in the sampled data. Is there a way to prevent SMOTE from oversampling these values or to exclude them from the interpolation process?&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Any guidance or best practices would be greatly appreciated.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Aug 2025 14:16:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SMOTESAMPLE/m-p/972980#M377620</guid>
      <dc:creator>phopkinson</dc:creator>
      <dc:date>2025-08-20T14:16:08Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SMOTESAMPLE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SMOTESAMPLE/m-p/972983#M377622</link>
      <description>&lt;P&gt;Please include your SAS log showing the code you are running and any messages you are getting. If you can provide sample data, that would also be helpful.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Aug 2025 15:36:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SMOTESAMPLE/m-p/972983#M377622</guid>
      <dc:creator>Kathryn_SAS</dc:creator>
      <dc:date>2025-08-20T15:36:57Z</dc:date>
    </item>
  </channel>
</rss>

