<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Client segmentation algorithm in banking using SAS EG/EM in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876749#M10501</link>
    <description />
    <pubDate>Sat, 20 May 2023 11:51:32 GMT</pubDate>
    <dc:creator>sbxkoenk</dc:creator>
    <dc:date>2023-05-20T11:51:32Z</dc:date>
    <item>
      <title>Client segmentation algorithm in banking using SAS EG/EM</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876721#M10499</link>
      <description>&lt;P&gt;I want to implement a segmentation methodology for a bank for their business banking clients - SMEs, MSBs, etc. The type of data they have includes: client level data (client industry, current status (active/inactive), what branch they opened their accounts, etc.), product holding information (what products they hold, product activation date/tenure, interest and fee income in the last 2 years, etc.), bank-to-bank transactions, POS billing, and more. The types of products include: POS, payment gateways, credit, debit and prepaid cards, fixed deposit accounts, interest bearing accounts, insurance account, trade finance (letter of credit and letter of guarantee), etc.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The client has both SAS EG and SAS EM. I wanted to know, from anyone's experience here, what the best clustering technique for this use case would be. I have very little experience with SAS EM, but am I correct in assuming it supports the most common clustering algorithms - k-means, SOMs, hierarchical, etc.?&amp;nbsp; Note that retail customers are completely excluded in this use case.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any guidance would be appreciated. Please move this accordingly if it doesn't fit here.&lt;/P&gt;</description>
      <pubDate>Fri, 19 May 2023 19:03:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876721#M10499</guid>
      <dc:creator>KJazem</dc:creator>
      <dc:date>2023-05-19T19:03:39Z</dc:date>
    </item>
    <item>
      <title>Re: Client segmentation algorithm in banking using SAS EG/EM</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876743#M10500</link>
      <description>Hello,&lt;BR /&gt;&lt;BR /&gt;For SAS EM, the Cluster node should do k-means and hierarchical clustering, using the Centroid and Ward options, respectively, under the Clustering Method menu. For SOM, there is the SOM/Kohonen node.&lt;BR /&gt;&lt;BR /&gt;You can find the Cluster node documentation here: &lt;A href="https://documentation.sas.com/doc/en/emref/14.3/n1vjatb74dundbn12d2ecb09juak.htm" target="_blank"&gt;https://documentation.sas.com/doc/en/emref/14.3/n1vjatb74dundbn12d2ecb09juak.htm&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;and the SOM/Kohonen documentation here: &lt;A href="https://documentation.sas.com/doc/en/emref/14.3/n0978xngiafo2ln1mpj80trq36qk.htm" target="_blank"&gt;https://documentation.sas.com/doc/en/emref/14.3/n0978xngiafo2ln1mpj80trq36qk.htm&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;You can also perform hierarchical and k-means clustering in EG using PROC CLUSTER, setting the method= option to either Centroid or Ward.&lt;BR /&gt;&lt;BR /&gt;Hope this helps.</description>
      <pubDate>Sat, 20 May 2023 08:38:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876743#M10500</guid>
      <dc:creator>GuyTreepwood</dc:creator>
      <dc:date>2023-05-20T08:38:58Z</dc:date>
    </item>
    <item>
      <title>Re: Client segmentation algorithm in banking using SAS EG/EM</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876749#M10501</link>
      <description>&lt;P&gt;Hello &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/432808"&gt;@KJazem&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I must unfortunately contradict &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/77218"&gt;@GuyTreepwood&lt;/a&gt;&amp;nbsp;.&lt;BR /&gt;The CLUSTER node in SAS Enterprise Miner does NOT do full-fledged hierarchical clustering on all observations (for big data, that would be an extremely challenging task). Hierarchical clustering in EM CLUSTER node is only an intermediate step to estimate the "best" number of clusters.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The Cluster node in Enterprise Miner (latest version is 15.2) is doing K-MEANS clustering!!&lt;/P&gt;
&lt;P&gt;Hierarchical clustering is just an intermediate step to determine the best number of clusters.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is how the CLUSTER node (in the Explore group) works when you do not change the defaults:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;k-means is done with k=50 (preliminary maximum)&lt;/LI&gt;
&lt;LI&gt;Then the 50 multivariate mean vectors are clustered with WARD (agglomerative) hierarchical clustering method&lt;/LI&gt;
&lt;LI&gt;Then the best number of clusters is determined (minimum=2 , final maximum=20). Let's say best = 8 !&lt;/LI&gt;
&lt;LI&gt;Then a k-means is done again on the full dataset with k=8.&lt;/LI&gt;
&lt;/OL&gt;
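&lt;P&gt;As a rough illustration of steps 1, 2 and 4 above, here is a minimal pure-Python sketch (not SAS code, and not what the Cluster node actually runs). Everything in it is an assumption made for illustration: the function names (kmeans, ward_reduce, two_stage_kmeans), the evenly spaced starting centroids, the weighting of each preliminary mean by its cluster size in the Ward step, and the seeding of the final k-means with the merged Ward means. Step 3 is not implemented; the final number of clusters is simply passed in as k_final.&lt;/P&gt;

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(rows):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def kmeans(points, centroids, max_iter=100):
    """Lloyd's k-means from the given starting centroids.

    Returns (centroids, labels). A cluster that loses all its points
    keeps its previous centroid.
    """
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_point(members)
    return centroids, labels


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if groups a and b merge."""
    (ca, na), (cb, nb) = a, b
    return na * nb / (na + nb) * sq_dist(ca, cb)


def ward_reduce(centroids, sizes, n_clusters):
    """Merge size-weighted means with Ward's criterion until n_clusters remain."""
    # Empty preliminary clusters (size 0) are dropped before merging.
    groups = [(tuple(c), n) for c, n in zip(centroids, sizes) if n]
    for _ in range(len(groups) - max(1, n_clusters)):
        pairs = [(i, j) for i in range(len(groups)) for j in range(i)]
        i, j = min(pairs, key=lambda ij: ward_cost(groups[ij[0]], groups[ij[1]]))
        (ci, ni), (cj, nj) = groups[i], groups[j]
        merged = tuple((a * ni + b * nj) / (ni + nj) for a, b in zip(ci, cj))
        # i is always the larger index, so pop it first to keep j valid.
        groups.pop(i)
        groups.pop(j)
        groups.append((merged, ni + nj))
    return [c for c, _ in groups]


def two_stage_kmeans(points, k_prelim, k_final):
    """Steps 1, 2 and 4 of the list above; k_final stands in for step 3."""
    # Step 1: preliminary k-means with a generous number of clusters.
    # Assumption: evenly spaced observations serve as starting centroids.
    step = max(1, len(points) // k_prelim)
    start = points[::step][:k_prelim]
    prelim_centroids, prelim_labels = kmeans(points, start)
    sizes = [prelim_labels.count(c) for c in range(len(prelim_centroids))]
    # Step 2: Ward clustering of the preliminary cluster means.
    seeds = ward_reduce(prelim_centroids, sizes, k_final)
    # Step 4: k-means on the full data with the chosen number of clusters.
    return kmeans(points, seeds)
```

&lt;P&gt;For example, two_stage_kmeans(points, k_prelim=50, k_final=8) would mirror the defaults and the "best = 8" example above, where points is a list of numeric tuples.&lt;/P&gt;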
&lt;P&gt;You can also use the "HP Cluster" node in the HPDM group of nodes (HPDM = High-Performance Data Mining).&lt;/P&gt;
&lt;P&gt;The "HP Cluster" node runs PROC HPCLUS in the background. The HPCLUS procedure is a high-performance procedure that performs k-means clustering.&lt;BR /&gt;The "HP Cluster" node (PROC HPCLUS) finds the number of clusters (the &lt;EM&gt;k&lt;/EM&gt;) using the aligned box criterion (ABC) method, and NOT via that foray into hierarchical clustering.&lt;/P&gt;
&lt;P&gt;In SAS Viya, PROC HPCLUS evolved into PROC KCLUS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Via the "Open Source Integration Node" in SAS EM, you can also apply "Spectral Clustering" to your data!&lt;/P&gt;
&lt;P&gt;Via the "SAS Code Node" in SAS EM, you can also apply PROC MODECLUS to your data!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="xisDoc-refProc"&gt;
&lt;DIV id="statug_introclus000019" class="aa-section"&gt;
&lt;DIV class="aa-deflist"&gt;
&lt;DL class="aa-deflist"&gt;
&lt;DT&gt;&lt;SPAN class=" aa-term "&gt;MODECLUS&lt;/SPAN&gt;&lt;/DT&gt;
&lt;DD&gt;
&lt;P class="xisDoc-paraSimpleFirst"&gt;finds disjoint clusters of observations with coordinate or distance data by using nonparametric density estimation. It can also perform approximate nonparametric significance tests for the number of clusters.&lt;/P&gt;
&lt;/DD&gt;
&lt;/DL&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Good luck,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Sat, 20 May 2023 11:51:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876749#M10501</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2023-05-20T11:51:32Z</dc:date>
    </item>
    <item>
      <title>Re: Client segmentation algorithm in banking using SAS EG/EM</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876750#M10502</link>
      <description>&lt;P&gt;In addition to my previous reply, here is one more note:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The best way, in my opinion, to assess the quality of your clustering solution is the Silhouette Coefficient.&lt;/P&gt;
&lt;P&gt;(you do ultimately want heterogeneity between clusters and homogeneity within clusters)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here are 3 useful articles / blogs :&lt;/P&gt;
&lt;UL class="lia-list-style-type-square"&gt;
&lt;LI&gt;Paper 3409-2019&lt;BR /&gt;How to Evaluate Different Clustering Results?&lt;BR /&gt;Ralph Abbey, SAS Institute Inc. &lt;BR /&gt;&lt;A href="https://support.sas.com/resources/papers/proceedings19/3409-2019.pdf" target="_blank"&gt;https://support.sas.com/resources/papers/proceedings19/3409-2019.pdf&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;What is the silhouette statistic in cluster analysis?&lt;BR /&gt;By Rick Wicklin on The DO Loop May 15, 2023&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2023/05/15/silhouette-statistic-cluster.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2023/05/15/silhouette-statistic-cluster.html&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Compute the silhouette statistic in SAS &lt;BR /&gt;By Rick Wicklin on The DO Loop May 17, 2023&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2023/05/17/compute-silhouette-sas.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2023/05/17/compute-silhouette-sas.html&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;If you do not have SAS/IML (PROC IML) in your license, you can calculate the Silhouette coefficient with a macro that uses PROC DISTANCE, PROC MEANS, and some DATA steps.&lt;/P&gt;
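&lt;P&gt;To make the calculation concrete, here is a minimal pure-Python sketch of the Silhouette coefficient (not SAS code, and not the macro mentioned above). It assumes Euclidean distance and gives a singleton cluster a silhouette of 0; the function names are made up for illustration. For each observation, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the silhouette value is (b - a) / max(a, b).&lt;/P&gt;

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_values(points, labels):
    """Return one silhouette value per observation."""
    # Group observation indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) in (0, 1):
        raise ValueError("silhouette needs at least two clusters")
    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Assumed convention: a singleton cluster gets a silhouette of 0.
            values.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all observations."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)
```

&lt;P&gt;Values close to 1 indicate compact, well-separated clusters; values near 0 indicate overlapping clusters; negative values suggest observations that sit closer to another cluster than to their own.&lt;/P&gt;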
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Sat, 20 May 2023 12:01:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876750#M10502</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2023-05-20T12:01:10Z</dc:date>
    </item>
    <item>
      <title>Re: Client segmentation algorithm in banking using SAS EG/EM</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876767#M10503</link>
      <description>These are very helpful, thank you for the references. A couple of follow-up questions: 1) Would you say K-means clustering works best with customer segmentation? We have many features so just want to see which works best - K-means, SOM, etc. and 2) Is the Silhouette coefficient the best metric to evaluate any clustering algorithm or specifically K-means?&lt;BR /&gt;&lt;BR /&gt;Thanks for the help!</description>
      <pubDate>Sat, 20 May 2023 18:18:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876767#M10503</guid>
      <dc:creator>KJazem</dc:creator>
      <dc:date>2023-05-20T18:18:17Z</dc:date>
    </item>
    <item>
      <title>Re: Client segmentation algorithm in banking using SAS EG/EM</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876777#M10504</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/432808"&gt;@KJazem&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;1) Would you say K-means clustering works best with customer segmentation? We have many features so just want to see which works best - K-means, SOM, etc. and &lt;BR /&gt;2) Is the Silhouette coefficient the best metric to evaluate any clustering algorithm or specifically K-means?&lt;/BLOCKQUOTE&gt;
&lt;P&gt;1) Hierarchical clustering (as done with PROC CLUSTER) is, in general, superior to k-means disjoint clustering, but with tens of thousands of customers and many features, the calculations can take many hours to finish.&lt;BR /&gt;Also, you might need to transform the data before clustering (the same goes for k-means, by the way).&lt;/P&gt;
&lt;DIV class="xisDoc-refProc"&gt;
&lt;DIV id="statug_aceclus000786" class="aa-section"&gt;
&lt;P class="xisDoc-paragraph"&gt;For example, you can use the ACECLUS procedure to obtain approximate estimates of the pooled within-cluster covariance matrix and to compute canonical variables for subsequent analysis. You use PROC ACECLUS to preprocess data before you cluster it by using the CLUSTER procedure.&lt;BR /&gt;PROC CLUSTER has many Clustering Methods (ultrametric and others) you can try out.&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2) In my opinion, the Silhouette coefficient is the best metric for evaluating any clustering solution, regardless of which algorithm produced it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Sun, 21 May 2023 01:33:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876777#M10504</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2023-05-21T01:33:03Z</dc:date>
    </item>
    <item>
      <title>Re: Client segmentation algorithm in banking using SAS EG/EM</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876922#M10505</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/432808"&gt;@KJazem&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For inspiration, you can also look here :&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.lexjansen.com/search/searchresults.php?q=%22customer%20segmentation%22" target="_blank"&gt;https://www.lexjansen.com/search/searchresults.php?q=%22customer%20segmentation%22&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SAS Tip: Learn lexjansen.com&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/SAS-Tips-from-the-Community/SAS-Tip-Learn-lexjansen-com/td-p/436336" target="_blank"&gt;https://communities.sas.com/t5/SAS-Tips-from-the-Community/SAS-Tip-Learn-lexjansen-com/td-p/436336&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Mon, 22 May 2023 16:17:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Client-segmentation-algorithm-in-banking-using-SAS-EG-EM/m-p/876922#M10505</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2023-05-22T16:17:58Z</dc:date>
    </item>
  </channel>
</rss>

