<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Unsupervised clustering in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516824#M7533</link>
    <description>&lt;P&gt;For some reason the article link doesn't work, so I have pasted the unsupervised clustering section below. I still don't understand its purpose, since the PCA and the relevant variables for this high-risk group were already obtained from the decision tree / variable importance chart.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Unsupervised Clustering: Subgroup Analysis of High-Risk Patients&lt;BR /&gt;We used principal component analysis [21] to reduce high-dimensional EMR features and identify clinically relevant groups of patients of high risk for 6-month ED visit with similar patterns of demographics, primary diagnosis and procedure, and chronic disease conditions. The features for high-risk patients were projected to a lower-dimensional subspace with largest variances. The K-means algorithm was applied to find potential patient patterns for future 6-month ED visit [22]. We used K=6 to generate the final six clusters. The technical details are described in Multimedia Appendix 9. Clustering patterns between retrospective and prospective cohorts were compared to further validate our high-risk case finding algorithm. As part of the health care management platform, our predictive model was integrated onto a Web-based dashboard to provide a real-time visualization of the population profile with ED 6-month visits.&lt;/P&gt;</description>
    <pubDate>Wed, 28 Nov 2018 18:52:53 GMT</pubDate>
    <dc:creator>chuie</dc:creator>
    <dc:date>2018-11-28T18:52:53Z</dc:date>
    <item>
      <title>Unsupervised clustering</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516509#M7531</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I found this diagram and article, in which the authors built a statistical model to identify a high-risk group and then performed unsupervised clustering.&lt;/P&gt;&lt;P&gt;I am not sure what the point of the unsupervised clustering is, since the statistical modeling already tells us which features (variable importance, nodes, etc.) characterize the high-risk group.&lt;/P&gt;&lt;P&gt;It is a great article, but I couldn't understand the logic behind this step.&lt;/P&gt;&lt;P&gt;Please help.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;C&lt;/P&gt;&lt;P&gt;&lt;A href="https://yougottabelieve.info/case-control-study-vs-cohort-study-retrospective&amp;nbsp;" target="_blank"&gt;https://yougottabelieve.info/case-control-study-vs-cohort-study-retrospective&amp;nbsp;&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="clusterPNG.PNG" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/25207iDDB669D93249989D/image-size/large?v=v2&amp;amp;px=999" role="button" title="clusterPNG.PNG" alt="clusterPNG.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 27 Nov 2018 22:30:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516509#M7531</guid>
      <dc:creator>chuie</dc:creator>
      <dc:date>2018-11-27T22:30:34Z</dc:date>
    </item>
    <item>
      <title>Re: Unsupervised clustering</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516527#M7532</link>
      <description>The link doesn't work. From the diagram, it looks like the high-risk group was used for the unsupervised clustering, which is usually done to tell us what we don't know. Yes, we know some variable importance, but exactly how that falls out for this subgroup may be different. The clustering results may run counter to what we expect, so it's a good step to go through to either confirm or reject assumptions.</description>
      <pubDate>Tue, 27 Nov 2018 23:41:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516527#M7532</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-11-27T23:41:31Z</dc:date>
    </item>
    <item>
      <title>Re: Unsupervised clustering</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516824#M7533</link>
      <description>&lt;P&gt;For some reason the article link doesn't work, so I have pasted the unsupervised clustering section below. I still don't understand its purpose, since the PCA and the relevant variables for this high-risk group were already obtained from the decision tree / variable importance chart.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Unsupervised Clustering: Subgroup Analysis of High-Risk Patients&lt;BR /&gt;We used principal component analysis [21] to reduce high-dimensional EMR features and identify clinically relevant groups of patients of high risk for 6-month ED visit with similar patterns of demographics, primary diagnosis and procedure, and chronic disease conditions. The features for high-risk patients were projected to a lower-dimensional subspace with largest variances. The K-means algorithm was applied to find potential patient patterns for future 6-month ED visit [22]. We used K=6 to generate the final six clusters. The technical details are described in Multimedia Appendix 9. Clustering patterns between retrospective and prospective cohorts were compared to further validate our high-risk case finding algorithm. As part of the health care management platform, our predictive model was integrated onto a Web-based dashboard to provide a real-time visualization of the population profile with ED 6-month visits.&lt;/P&gt;</description>
      <pubDate>Wed, 28 Nov 2018 18:52:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516824#M7533</guid>
      <dc:creator>chuie</dc:creator>
      <dc:date>2018-11-28T18:52:53Z</dc:date>
    </item>
    <item>
      <title>Re: Unsupervised clustering</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516825#M7534</link>
      <description>&amp;gt; The technical details are described in Multimedia Appendix 9&lt;BR /&gt;Do you have access to that?</description>
      <pubDate>Wed, 28 Nov 2018 18:58:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516825#M7534</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-11-28T18:58:24Z</dc:date>
    </item>
    <item>
      <title>Re: Unsupervised clustering</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516857#M7535</link>
      <description>&lt;P&gt;It just explains how to do it, not why &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;********************************************&lt;/P&gt;&lt;P&gt;Multimedia Appendix 9. Unsupervised clustering of high-risk population using PCA.&lt;BR /&gt;To reduce high-dimensional EMR features for detecting cohort pattern, we used principal component analysis (PCA) to divide the high-risk patients of future 6-month ED visit identified by our algorithm in the prospective cohort into distinctive groups, based on demographics, primary diagnosis and procedure, and chronic disease conditions. The features for high-risk patients are projected to a lower-dimensional subspace with largest variances.&lt;BR /&gt;Where Xi is the EMR feature matrix for each high-risk patient, and wk is the set of vectors of weights that map each patient feature vector Xi to a new vector of principal component scores Tik. We computed w1 by solving the following objective functions (1) and (2), and wk by iterating objective function (3) based on the first k-1 principal components,&lt;BR /&gt;Then the K-means algorithm was applied on top of the principal component Tik subspace of PCA to find potential patient patterns for future 6-month ED visit. We used K=6 to set the initial k means for the algorithm and calculated the Euclidean centroid m to generate the final clusters,&lt;BR /&gt;Where Ci is the i-th of the 6 clusters, and x represents the previous principal components Tk.&lt;BR /&gt;Unique patterns revealed by the clustering results were analyzed to characterize the high-risk subjects identified by our ED algorithm.&lt;/P&gt;</description>
      <pubDate>Wed, 28 Nov 2018 20:18:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Unsupervised-clustering/m-p/516857#M7535</guid>
      <dc:creator>chuie</dc:creator>
      <dc:date>2018-11-28T20:18:57Z</dc:date>
    </item>
  </channel>
</rss>

