<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SEMMA in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434897#M6700</link>
    <description>&lt;P&gt;Thanks for your answer, Padraic. I asked because I have never come across (in my non-exhaustive search) work where the sampling was not performed directly on the raw, full imported data set. Nicolas&lt;/P&gt;</description>
    <pubDate>Wed, 07 Feb 2018 15:25:56 GMT</pubDate>
    <dc:creator>NicolasC</dc:creator>
    <dc:date>2018-02-07T15:25:56Z</dc:date>
    <item>
      <title>SEMMA</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434783#M6694</link>
      <description>&lt;P&gt;Hi there&lt;/P&gt;&lt;P&gt;This may sound like a stupid question, but in the SEMMA methodology, why is sampling first?&lt;/P&gt;&lt;P&gt;In other words, if I first manipulate my large data set (imputing missing values, binning interval data, etc.) and only then sample from it to build my model, is that complete nonsense?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Nicolas&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 08:48:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434783#M6694</guid>
      <dc:creator>NicolasC</dc:creator>
      <dc:date>2018-02-07T08:48:47Z</dc:date>
    </item>
    <item>
      <title>Re: SEMMA</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434878#M6697</link>
      <description>&lt;P&gt;Your approach is fine. I came up with "SEMMA" as an easily remembered guide for those who have little analytical experience.&amp;nbsp; People with analytical experience will do what they know best.&lt;/P&gt;
&lt;P&gt;-Padraic&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 14:51:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434878#M6697</guid>
      <dc:creator>PadraicGNeville</dc:creator>
      <dc:date>2018-02-07T14:51:10Z</dc:date>
    </item>
    <item>
      <title>Re: SEMMA</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434897#M6700</link>
      <description>&lt;P&gt;Thanks for your answer, Padraic. I asked because I have never come across (in my non-exhaustive search) work where the sampling was not performed directly on the raw, full imported data set. Nicolas&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 15:25:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434897#M6700</guid>
      <dc:creator>NicolasC</dc:creator>
      <dc:date>2018-02-07T15:25:56Z</dc:date>
    </item>
    <item>
      <title>Re: SEMMA</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434921#M6702</link>
      <description>&lt;P&gt;Well, if there is one sample to do the analysis and another sample held out to evaluate the results of that analysis, and missing values are imputed using all the data, then the evaluation data is not completely independent when there are a lot of missing values.&amp;nbsp; So the better practice, which is also simpler to explain and may head off criticism of the results, is to impute &amp;amp; bin on each sample separately.&lt;/P&gt;
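&lt;P&gt;As a minimal sketch of that order (sample first, then impute each sample on its own), assuming a single numeric column with missing values stored as None and simple mean imputation. The names impute_mean and sample_then_impute are made up for illustration and are not SAS procedures:&lt;/P&gt;
&lt;PRE&gt;
```python
import random
from statistics import mean


def impute_mean(values):
    """Fill None entries with the mean of the observed entries of this sample only."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)  # raises StatisticsError if the sample has no observed values
    return [fill if v is None else v for v in values]


def sample_then_impute(values, holdout_fraction=0.3, seed=0):
    """Split into analysis and holdout samples first, then impute each one separately."""
    rng = random.Random(seed)  # hypothetical fixed seed so the split is repeatable
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    holdout = shuffled[:n_holdout]
    analysis = shuffled[n_holdout:]
    # Neither fill value is computed from the other sample's rows.
    return impute_mean(analysis), impute_mean(holdout)


if __name__ == "__main__":
    column = [1.0, None, 3.0, 5.0, None, 7.0, 9.0, 11.0, 13.0, 15.0]
    analysis, holdout = sample_then_impute(column)
    print(len(analysis), len(holdout))
```
&lt;/PRE&gt;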
&lt;P&gt;That said, it often doesn't matter.&amp;nbsp; It's an art.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 15:51:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/SEMMA/m-p/434921#M6702</guid>
      <dc:creator>PadraicGNeville</dc:creator>
      <dc:date>2018-02-07T15:51:38Z</dc:date>
    </item>
  </channel>
</rss>

