<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Trouble with Clustering Node in Model Studio in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805846#M9154</link>
    <description>Thank you very much. I ended up using PROC KCLUS because, as far as I can tell, it replicates Model Studio's Clustering node.</description>
    <pubDate>Mon, 04 Apr 2022 13:08:23 GMT</pubDate>
    <dc:creator>mevargasm</dc:creator>
    <dc:date>2022-04-04T13:08:23Z</dc:date>
    <item>
      <title>Trouble with Clustering Node in Model Studio</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805358#M9143</link>
      <description>&lt;P&gt;Hi! I'm working with the Model Studio Clustering node to segment a small database of 70 rows and 35 columns. Except for the ID, all columns are interval variables that were previously standardized. My pipeline is extremely simple and looks like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="mevargasm_0-1648743949472.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/70008i33D9E81598293DD2/image-size/medium?v=v2&amp;amp;px=400" role="button" title="mevargasm_0-1648743949472.png" alt="mevargasm_0-1648743949472.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The results from the Clustering node show 5 clusters as the optimal number:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="mevargasm_1-1648744012851.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/70009i7B7BB42191C80627/image-size/medium?v=v2&amp;amp;px=400" role="button" title="mevargasm_1-1648744012851.png" alt="mevargasm_1-1648744012851.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;However, when exporting the data (either from the Output Data tab of the node, or using a Save Data or Score Data node), all rows display a null value in the _CLUSTER_ID_ column:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="mevargasm_2-1648744185438.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/70010iABB2DA3C59467FFE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="mevargasm_2-1648744185438.png" alt="mevargasm_2-1648744185438.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;What could be causing the issue?&lt;/P&gt;</description>
      <pubDate>Thu, 31 Mar 2022 16:30:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805358#M9143</guid>
      <dc:creator>mevargasm</dc:creator>
      <dc:date>2022-03-31T16:30:20Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble with Clustering Node in Model Studio</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805378#M9144</link>
      <description>&lt;P&gt;I have moved this post to the 'Data Mining and Machine Learning' board (where it belongs).&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Thu, 31 Mar 2022 18:42:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805378#M9144</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-03-31T18:42:07Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble with Clustering Node in Model Studio</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805383#M9145</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;I would have to investigate.&lt;BR /&gt;What you see is odd and certainly not normal behaviour.&lt;/P&gt;
&lt;P&gt;Before I try to reproduce this in Model Studio, one remark:&lt;/P&gt;
&lt;P&gt;The Model Studio VDMML clustering is built for big data. I'm not sure it will behave well with only 70 records and 35 variables.&lt;/P&gt;
&lt;P&gt;If I were doing this, I would use a procedure (or a task in SAS Studio).&lt;BR /&gt;Procedures that you can use are:&lt;/P&gt;
&lt;UL class="lia-list-style-type-square"&gt;
&lt;LI&gt;PROC FASTCLUS (k-means)&lt;/LI&gt;
&lt;LI&gt;PROC CLUSTER (hierarchical clustering)&lt;/LI&gt;
&lt;LI&gt;PROC KCLUS (k-means, with the option to estimate the "best" &lt;EM&gt;k&lt;/EM&gt; using the aligned box criterion, ABC)&lt;/LI&gt;
&lt;LI&gt;PROC HPCLUS (High-Performance k-means clustering)&lt;/LI&gt;
&lt;LI&gt;PROC MODECLUS (finds disjoint clusters of observations with coordinate or distance data by using nonparametric density estimation)&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Clustering with the Nonparametric Bayes action set (action nonParametricBayes.gmm) in PROC CAS. You can also use the GMM procedure here.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;PROC MBC (model-based clustering)&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;SPAN&gt;Good luck,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Koen&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Mar 2022 18:54:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805383#M9145</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-03-31T18:54:55Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble with Clustering Node in Model Studio</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805846#M9154</link>
      <description>Thank you very much. I ended up using PROC KCLUS because, as far as I can tell, it replicates Model Studio's Clustering node.</description>
      <pubDate>Mon, 04 Apr 2022 13:08:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Trouble-with-Clustering-Node-in-Model-Studio/m-p/805846#M9154</guid>
      <dc:creator>mevargasm</dc:creator>
      <dc:date>2022-04-04T13:08:23Z</dc:date>
    </item>
  </channel>
</rss>

