<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Binning and Pre-Binning in Interactive Grouping in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674660#M8365</link>
    <description>&lt;P&gt;Thanks Ksharp. If I understand correctly, we can use either the quantile, bucket OR tree method for binning. Is that correct?&lt;/P&gt;&lt;P&gt;The documentation states that quantile/bucket binning is a pre-binning stage before a tree-based method can be applied:&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;DIV class="xis-paragraph"&gt;"The&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="xis-windowItem"&gt;Interactive Grouping&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;node&lt;FONT color="#FF0000"&gt; first performs binning&lt;/FONT&gt; on the interval characteristic. You can choose between two binning methods: quantile and bucket. The quantile method generates groups. The groups are formed by ranked quantities with approximately the same frequency in each group. The bucket method generates groups by dividing the data into evenly spaced intervals that are based on the difference between the maximum and minimum values.&lt;/DIV&gt;&lt;DIV class="xis-paragraph"&gt;&lt;FONT color="#FF0000"&gt;After the interval variables have been pre-binned&lt;/FONT&gt;, a decision tree model is fitted for each characteristic.&lt;SPAN&gt;&amp;nbsp;"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/BLOCKQUOTE&gt;&lt;DIV class="xis-paragraph"&gt;&lt;SPAN&gt;So is tree binning a sequential process that starts with quantile/bucket pre-binning, or can we use quantile, bucket and tree as alternative binning methods?&lt;/SPAN&gt;&lt;/DIV&gt;</description>
    <pubDate>Wed, 05 Aug 2020 07:52:28 GMT</pubDate>
    <dc:creator>ronya</dc:creator>
    <dc:date>2020-08-05T07:52:28Z</dc:date>
    <item>
      <title>Binning and Pre-Binning in Interactive Grouping</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674327#M8361</link>
      <description>&lt;P&gt;Hello all, I'm exploring the use of Interactive Grouping in SAS Enterprise Miner as a method to bin the values of interval characteristics, and I wish to ask about the pre-binning process. Why do we need to use the quantile or bucket method to pre-bin the interval variable values rather than apply tree-based binning to the interval values directly?&lt;/P&gt;</description>
      <pubDate>Tue, 04 Aug 2020 12:03:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674327#M8361</guid>
      <dc:creator>ronya</dc:creator>
      <dc:date>2020-08-04T12:03:58Z</dc:date>
    </item>
    <item>
      <title>Re: Binning and Pre-Binning in Interactive Grouping</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674341#M8362</link>
      <description>The quantile and bucket methods are simple and easy to use.&lt;BR /&gt;If you are building a credit scorecard, tree-based binning can't guarantee that the WOE (weight of evidence) is monotonic.</description>
      <pubDate>Tue, 04 Aug 2020 12:59:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674341#M8362</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2020-08-04T12:59:18Z</dc:date>
    </item>
    <item>
      <title>Re: Binning and Pre-Binning in Interactive Grouping</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674660#M8365</link>
      <description>&lt;P&gt;Thanks Ksharp. If I understand correctly, we can use either the quantile, bucket OR tree method for binning. Is that correct?&lt;/P&gt;&lt;P&gt;The documentation states that quantile/bucket binning is a pre-binning stage before a tree-based method can be applied:&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;DIV class="xis-paragraph"&gt;"The&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="xis-windowItem"&gt;Interactive Grouping&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;node&lt;FONT color="#FF0000"&gt; first performs binning&lt;/FONT&gt; on the interval characteristic. You can choose between two binning methods: quantile and bucket. The quantile method generates groups. The groups are formed by ranked quantities with approximately the same frequency in each group. The bucket method generates groups by dividing the data into evenly spaced intervals that are based on the difference between the maximum and minimum values.&lt;/DIV&gt;&lt;DIV class="xis-paragraph"&gt;&lt;FONT color="#FF0000"&gt;After the interval variables have been pre-binned&lt;/FONT&gt;, a decision tree model is fitted for each characteristic.&lt;SPAN&gt;&amp;nbsp;"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/BLOCKQUOTE&gt;&lt;DIV class="xis-paragraph"&gt;&lt;SPAN&gt;So is tree binning a sequential process that starts with quantile/bucket pre-binning, or can we use quantile, bucket and tree as alternative binning methods?&lt;/SPAN&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 05 Aug 2020 07:52:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674660#M8365</guid>
      <dc:creator>ronya</dc:creator>
      <dc:date>2020-08-05T07:52:28Z</dc:date>
    </item>
    <item>
      <title>Re: Binning and Pre-Binning in Interactive Grouping</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674691#M8366</link>
      <description>I think quantile, bucket and tree are just three binning methods, and you can use any one of them.&lt;BR /&gt;Some people prefer tree, others prefer quantile.&lt;BR /&gt;&lt;BR /&gt;You could create many groups, say 20, with the quantile or bucket method, then repeatedly merge two groups into one so as to maximize chi-square or Gini, and so on. I think that is the tree method.</description>
      <pubDate>Wed, 05 Aug 2020 10:57:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674691#M8366</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2020-08-05T10:57:57Z</dc:date>
    </item>
    <item>
      <title>Re: Binning and Pre-Binning in Interactive Grouping</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674738#M8367</link>
      <description>&lt;P&gt;Someone from SAS may be able to provide a more accurate response, but, as far as I know, the algorithm behind the Interactive Grouping node uses a two-step approach for interval variables:&lt;/P&gt;
&lt;P&gt;1. First, it "discretizes" the variables by creating groups, essentially transforming the variables from interval to nominal.&lt;/P&gt;
&lt;P&gt;2. Second, it applies a tree-based logic to find the optimal binning based on the groups from step (1).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My understanding is that the above approach is used only for computational efficiency: in general, interval variables may have hundreds, if not thousands, of distinct values, which would make it too computationally intensive for a tree algorithm to evaluate every possible split.&lt;/P&gt;
&lt;P&gt;Therefore, by carrying out a pre-binning step, you end up with far fewer categories which then can be optimised based on a Tree-like algorithm.&lt;/P&gt;
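&lt;P&gt;The two steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not SAS code and not the actual Interactive Grouping node algorithm: the function names (quantile_edges, bucket_edges, best_split) are hypothetical, the target is assumed to be binary (0/1), Gini impurity is assumed as the split criterion, and only a single split is searched rather than a full tree.&lt;/P&gt;
&lt;PRE&gt;
```python
from bisect import bisect_right


def quantile_edges(values, n_bins):
    """Step 1 (quantile): cut points giving roughly equal counts per pre-bin."""
    ordered = sorted(values)
    n = len(ordered)
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    return sorted(set(edges))  # drop duplicate cut points caused by ties


def bucket_edges(values, n_bins):
    """Step 1 (bucket): cut points giving equal-width pre-bins from min to max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + k * width for k in range(1, n_bins)]


def gini(events, total):
    """Gini impurity of a group with a binary (0/1) target."""
    if total == 0:
        return 0.0
    p = events / total
    return 2.0 * p * (1.0 - p)


def best_split(values, target, edges):
    """Step 2: search for the best split among the pre-bin boundaries only."""
    n_bins = len(edges) + 1
    count = [0] * n_bins
    event = [0] * n_bins
    # Aggregate the raw rows into per-pre-bin counts (interval to nominal).
    for v, y in zip(values, target):
        b = bisect_right(edges, v)
        count[b] += 1
        event[b] += y
    total, total_event = sum(count), sum(event)
    best_edge, best_impurity = None, gini(total_event, total)
    left_n = left_e = 0
    # Each edge is one candidate: values below the edge go to the left group.
    for i, edge in enumerate(edges):
        left_n += count[i]
        left_e += event[i]
        right_n, right_e = total - left_n, total_event - left_e
        weighted = (left_n * gini(left_e, left_n)
                    + right_n * gini(right_e, right_n)) / total
        if best_impurity > weighted:
            best_edge, best_impurity = edge, weighted
    return best_edge, len(edges)


# Made-up data: 100 distinct values, event rate jumps above 60.
values = list(range(1, 101))
target = [1 if v > 60 else 0 for v in values]
edges = quantile_edges(values, 10)
print(best_split(values, target, edges))
```
&lt;/PRE&gt;
&lt;P&gt;With this made-up data the search evaluates 9 candidate boundaries instead of the 99 boundaries between distinct raw values, which is the efficiency gain described above. Swapping quantile_edges for bucket_edges changes only how the pre-bins are formed, not the search itself.&lt;/P&gt;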
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Lastly, from my experience, unless you have a good reason for using "bucket", my advice is to always go for "quantile" (i.e. that should be the default approach unless, for some specific reason, you want to have groups defined by having the same width).&lt;/P&gt;</description>
      <pubDate>Wed, 05 Aug 2020 14:42:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/674738#M8367</guid>
      <dc:creator>pvareschi</dc:creator>
      <dc:date>2020-08-05T14:42:58Z</dc:date>
    </item>
    <item>
      <title>Re: Binning and Pre-Binning in Interactive Grouping</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/675255#M8369</link>
      <description>Yes, that is all correct!</description>
      <pubDate>Fri, 07 Aug 2020 15:12:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/675255#M8369</guid>
      <dc:creator>WendyCzika</dc:creator>
      <dc:date>2020-08-07T15:12:13Z</dc:date>
    </item>
    <item>
      <title>Re: Binning and Pre-Binning in Interactive Grouping</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/675468#M8370</link>
      <description>&lt;P&gt;Many thanks for the detailed response and recommendation. I've had a chance to run the 2-stage process and see the binning/grouping steps and their coarse/fine views.&lt;/P&gt;</description>
      <pubDate>Sun, 09 Aug 2020 10:46:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Binning-and-Pre-Binning-in-Interactive-Grouping/m-p/675468#M8370</guid>
      <dc:creator>ronya</dc:creator>
      <dc:date>2020-08-09T10:46:24Z</dc:date>
    </item>
  </channel>
</rss>

