<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Principal Component Analysis - Optimal number of retained factors in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483113#M25095</link>
    <description>&lt;P&gt;I am trying to do dimension reduction using Principal Component Analysis. The dataset has 25 variables and 300K observations. The data will be used for segmentation with 2-stage clustering (K-means clustering, then linkage clustering).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What are good practices for deciding the number of retained factors? Is criterion #1 good enough?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Criterion #1:&amp;nbsp;&lt;SPAN&gt;eigenvalue &amp;gt; 1&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;-&amp;gt; 5 factors with 54% of variation explained. Is the variation explained too low? Should I use criterion #2 instead?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Criterion #2:&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;eigenvalue &amp;gt; 0.7 and variation explained &amp;gt; 0.7&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-&amp;gt; 10 factors with 78% of variation explained&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ods graphics on;

proc factor data=myData preplot plots=(scree initloadings preloadings loadings) method=principal rotate=varimax 
scree score;
var _numeric_;

run;
ods graphics off;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="PC.png" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/22147i4DFC9CA066E95E5F/image-size/large?v=v2&amp;amp;px=999" role="button" title="PC.png" alt="PC.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;DIV class="branch"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="branch"&gt;&lt;BR /&gt;&lt;DIV class="branch"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class="branch"&gt;&lt;TABLE&gt;&lt;TBODY&gt;
&lt;TR&gt;&lt;TD&gt;#&lt;/TD&gt;&lt;TD&gt;Eigenvalue&lt;/TD&gt;&lt;TD&gt;Difference&lt;/TD&gt;&lt;TD&gt;Proportion&lt;/TD&gt;&lt;TD&gt;Cumulative&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;4.59175884&lt;/TD&gt;&lt;TD&gt;3.04985582&lt;/TD&gt;&lt;TD&gt;0.2551&lt;/TD&gt;&lt;TD&gt;0.2551&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;1.54190302&lt;/TD&gt;&lt;TD&gt;0.12646714&lt;/TD&gt;&lt;TD&gt;0.0857&lt;/TD&gt;&lt;TD&gt;0.3408&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;1.41543588&lt;/TD&gt;&lt;TD&gt;0.23647927&lt;/TD&gt;&lt;TD&gt;0.0786&lt;/TD&gt;&lt;TD&gt;0.4194&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;1.17895661&lt;/TD&gt;&lt;TD&gt;0.09521203&lt;/TD&gt;&lt;TD&gt;0.0655&lt;/TD&gt;&lt;TD&gt;0.4849&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;5&lt;/TD&gt;&lt;TD&gt;1.08374458&lt;/TD&gt;&lt;TD&gt;0.16097769&lt;/TD&gt;&lt;TD&gt;0.0602&lt;/TD&gt;&lt;TD&gt;0.5451&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;6&lt;/TD&gt;&lt;TD&gt;0.92276689&lt;/TD&gt;&lt;TD&gt;0.04595209&lt;/TD&gt;&lt;TD&gt;0.0513&lt;/TD&gt;&lt;TD&gt;0.5964&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;7&lt;/TD&gt;&lt;TD&gt;0.8768148&lt;/TD&gt;&lt;TD&gt;0.00522994&lt;/TD&gt;&lt;TD&gt;0.0487&lt;/TD&gt;&lt;TD&gt;0.6451&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;8&lt;/TD&gt;&lt;TD&gt;0.87158485&lt;/TD&gt;&lt;TD&gt;0.06006623&lt;/TD&gt;&lt;TD&gt;0.0484&lt;/TD&gt;&lt;TD&gt;0.6935&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;9&lt;/TD&gt;&lt;TD&gt;0.81151862&lt;/TD&gt;&lt;TD&gt;0.05330799&lt;/TD&gt;&lt;TD&gt;0.0451&lt;/TD&gt;&lt;TD&gt;0.7386&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;10&lt;/TD&gt;&lt;TD&gt;0.75821063&lt;/TD&gt;&lt;TD&gt;0.06929076&lt;/TD&gt;&lt;TD&gt;0.0421&lt;/TD&gt;&lt;TD&gt;0.7807&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;0.68891987&lt;/TD&gt;&lt;TD&gt;0.05741897&lt;/TD&gt;&lt;TD&gt;0.0383&lt;/TD&gt;&lt;TD&gt;0.819&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;0.6315009&lt;/TD&gt;&lt;TD&gt;0.02247774&lt;/TD&gt;&lt;TD&gt;0.0351&lt;/TD&gt;&lt;TD&gt;0.8541&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;0.60902316&lt;/TD&gt;&lt;TD&gt;0.01814238&lt;/TD&gt;&lt;TD&gt;0.0338&lt;/TD&gt;&lt;TD&gt;0.8879&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;0.59088079&lt;/TD&gt;&lt;TD&gt;0.03822865&lt;/TD&gt;&lt;TD&gt;0.0328&lt;/TD&gt;&lt;TD&gt;0.9207&lt;/TD&gt;&lt;/TR&gt;
&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/DIV&gt;</description>
    <pubDate>Wed, 01 Aug 2018 17:03:17 GMT</pubDate>
    <dc:creator>Fae</dc:creator>
    <dc:date>2018-08-01T17:03:17Z</dc:date>
    <item>
      <title>Principal Component Analysis - Optimal number of retained factors</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483113#M25095</link>
      <description>&lt;P&gt;I am trying to do dimension reduction using Principal Component Analysis. The dataset has 25 variables and 300K observations. The data will be used for segmentation with 2-stage clustering (K-means clustering, then linkage clustering).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What are good practices for deciding the number of retained factors? Is criterion #1 good enough?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Criterion #1:&amp;nbsp;&lt;SPAN&gt;eigenvalue &amp;gt; 1&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;-&amp;gt; 5 factors with 54% of variation explained. Is the variation explained too low? Should I use criterion #2 instead?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Criterion #2:&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;eigenvalue &amp;gt; 0.7 and variation explained &amp;gt; 0.7&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-&amp;gt; 10 factors with 78% of variation explained&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ods graphics on;

proc factor data=myData preplot plots=(scree initloadings preloadings loadings) method=principal rotate=varimax 
scree score;
var _numeric_;

run;
ods graphics off;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="PC.png" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/22147i4DFC9CA066E95E5F/image-size/large?v=v2&amp;amp;px=999" role="button" title="PC.png" alt="PC.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;DIV class="branch"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="branch"&gt;&lt;BR /&gt;&lt;DIV class="branch"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class="branch"&gt;&lt;TABLE&gt;&lt;TBODY&gt;
&lt;TR&gt;&lt;TD&gt;#&lt;/TD&gt;&lt;TD&gt;Eigenvalue&lt;/TD&gt;&lt;TD&gt;Difference&lt;/TD&gt;&lt;TD&gt;Proportion&lt;/TD&gt;&lt;TD&gt;Cumulative&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;4.59175884&lt;/TD&gt;&lt;TD&gt;3.04985582&lt;/TD&gt;&lt;TD&gt;0.2551&lt;/TD&gt;&lt;TD&gt;0.2551&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;1.54190302&lt;/TD&gt;&lt;TD&gt;0.12646714&lt;/TD&gt;&lt;TD&gt;0.0857&lt;/TD&gt;&lt;TD&gt;0.3408&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;1.41543588&lt;/TD&gt;&lt;TD&gt;0.23647927&lt;/TD&gt;&lt;TD&gt;0.0786&lt;/TD&gt;&lt;TD&gt;0.4194&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;1.17895661&lt;/TD&gt;&lt;TD&gt;0.09521203&lt;/TD&gt;&lt;TD&gt;0.0655&lt;/TD&gt;&lt;TD&gt;0.4849&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;5&lt;/TD&gt;&lt;TD&gt;1.08374458&lt;/TD&gt;&lt;TD&gt;0.16097769&lt;/TD&gt;&lt;TD&gt;0.0602&lt;/TD&gt;&lt;TD&gt;0.5451&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;6&lt;/TD&gt;&lt;TD&gt;0.92276689&lt;/TD&gt;&lt;TD&gt;0.04595209&lt;/TD&gt;&lt;TD&gt;0.0513&lt;/TD&gt;&lt;TD&gt;0.5964&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;7&lt;/TD&gt;&lt;TD&gt;0.8768148&lt;/TD&gt;&lt;TD&gt;0.00522994&lt;/TD&gt;&lt;TD&gt;0.0487&lt;/TD&gt;&lt;TD&gt;0.6451&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;8&lt;/TD&gt;&lt;TD&gt;0.87158485&lt;/TD&gt;&lt;TD&gt;0.06006623&lt;/TD&gt;&lt;TD&gt;0.0484&lt;/TD&gt;&lt;TD&gt;0.6935&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;9&lt;/TD&gt;&lt;TD&gt;0.81151862&lt;/TD&gt;&lt;TD&gt;0.05330799&lt;/TD&gt;&lt;TD&gt;0.0451&lt;/TD&gt;&lt;TD&gt;0.7386&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;10&lt;/TD&gt;&lt;TD&gt;0.75821063&lt;/TD&gt;&lt;TD&gt;0.06929076&lt;/TD&gt;&lt;TD&gt;0.0421&lt;/TD&gt;&lt;TD&gt;0.7807&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;0.68891987&lt;/TD&gt;&lt;TD&gt;0.05741897&lt;/TD&gt;&lt;TD&gt;0.0383&lt;/TD&gt;&lt;TD&gt;0.819&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;0.6315009&lt;/TD&gt;&lt;TD&gt;0.02247774&lt;/TD&gt;&lt;TD&gt;0.0351&lt;/TD&gt;&lt;TD&gt;0.8541&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;0.60902316&lt;/TD&gt;&lt;TD&gt;0.01814238&lt;/TD&gt;&lt;TD&gt;0.0338&lt;/TD&gt;&lt;TD&gt;0.8879&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;0.59088079&lt;/TD&gt;&lt;TD&gt;0.03822865&lt;/TD&gt;&lt;TD&gt;0.0328&lt;/TD&gt;&lt;TD&gt;0.9207&lt;/TD&gt;&lt;/TR&gt;
&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 01 Aug 2018 17:03:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483113#M25095</guid>
      <dc:creator>Fae</dc:creator>
      <dc:date>2018-08-01T17:03:17Z</dc:date>
    </item>
    <item>
      <title>Re: Principal Component Analysis - Optimal number of retained factors</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483114#M25096</link>
      <description>&lt;P&gt;Honestly, I think the answer is totally subjective here. I don't believe that there is a universally accepted answer. The scree plot might indicate 7 factors.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, I would say that if you (for example) choose the 5-factor solution, but find that factor 6 has a clear interpretation that makes sense in your application, that's a reason (again, a subjective one) to include factor 6.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As for whether explaining 54% of the variability is enough: again, there is no universal answer, because every situation is different. For some data in some fields of application, 54% might be fantastic, while in other fields 54% might be poor.&lt;/P&gt;
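&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To compare what each cutoff from the question retains, the posted eigenvalue table can be tabulated directly. The following is only a minimal sketch: the data set name (eigen) and the variable and column names are made up for illustration, and the values are copied from the table in the question.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Eigenvalues and cumulative proportions copied from the table in the question. */
/* The data set name (eigen) and variable names are illustrative assumptions.    */
data eigen;
  input component eigenvalue cumulative;
  datalines;
1 4.59175884 0.2551
2 1.54190302 0.3408
3 1.41543588 0.4194
4 1.17895661 0.4849
5 1.08374458 0.5451
6 0.92276689 0.5964
7 0.8768148 0.6451
8 0.87158485 0.6935
9 0.81151862 0.7386
10 0.75821063 0.7807
11 0.68891987 0.819
12 0.6315009 0.8541
13 0.60902316 0.8879
14 0.59088079 0.9207
;
run;

/* Count the components each cutoff keeps and report the cumulative  */
/* proportion of variance reached at the last retained component.    */
proc sql;
  select sum(eigenvalue gt 1)   as n_kept_gt_1,
         max(case when eigenvalue gt 1 then cumulative end)   as cum_gt_1  format=6.4,
         sum(eigenvalue gt 0.7) as n_kept_gt_07,
         max(case when eigenvalue gt 0.7 then cumulative end) as cum_gt_07 format=6.4
  from eigen;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With the posted values this reports 5 components (cumulative 0.5451) for the cutoff of 1 and 10 components (cumulative 0.7807) for the cutoff of 0.7, matching the counts in the question. It only restates the two rules side by side; it does not say which one is right for your application.&lt;/P&gt;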
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Aug 2018 17:05:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483114#M25096</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-08-01T17:05:33Z</dc:date>
    </item>
    <item>
      <title>Re: Principal Component Analysis - Optimal number of retained factors</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483190#M25098</link>
      <description>&lt;P&gt;Useful advice is available here:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="http://documentation.sas.com/?docsetId=statug&amp;amp;docsetTarget=statug_factor_details05.htm&amp;amp;docsetVersion=14.3&amp;amp;locale=en" target="_self"&gt;http://documentation.sas.com/?docsetId=statug&amp;amp;docsetTarget=statug_factor_details05.htm&amp;amp;docsetVersion=14.3&amp;amp;locale=en&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Aug 2018 20:07:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483190#M25098</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-08-01T20:07:25Z</dc:date>
    </item>
    <item>
      <title>Re: Principal Component Analysis - Optimal number of retained factors</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483200#M25099</link>
      <description>&lt;P&gt;Hey,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;An additional criterion would be parallel analysis by Horn (1965; &lt;A href="https://link.springer.com/article/10.1007%2FBF02289447" target="_blank"&gt;https://link.springer.com/article/10.1007%2FBF02289447&lt;/A&gt;).&lt;/P&gt;
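&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A rough sketch of the idea, not a definitive implementation: simulate uncorrelated normal data of the same shape, compute the eigenvalues of each simulated correlation matrix, and keep only the observed components whose eigenvalue exceeds the corresponding simulated threshold. Everything named here is an assumption for illustration: the data set names (random, stats, simEigen, threshold), the seed, 100 simulations, 1000 observations per simulation (kept small so it runs quickly), and 18 variables, which is inferred from the posted table (4.59175884 / 0.2551 is about 18) rather than stated in the question. The 95th percentile is a commonly used variant; Horn's original proposal compares against the mean of the simulated eigenvalues.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Step 1: simulate 100 data sets of 1000 uncorrelated standard normal */
/* observations on 18 variables (all three sizes are assumptions).     */
data random;
  call streaminit(12345);
  array x[18] x1-x18;
  do sim = 1 to 100;
    do obs = 1 to 1000;
      do j = 1 to 18;
        x[j] = rand('Normal');
      end;
      output;
    end;
  end;
  drop j obs;
run;

/* Step 2: eigenvalues of the correlation matrix for each simulation.  */
proc princomp data=random outstat=stats noprint;
  by sim;
  var x1-x18;
run;

/* Step 3: keep only the eigenvalue rows. In this row, x1 holds the    */
/* largest eigenvalue, x2 the second largest, and so on.               */
data simEigen;
  set stats;
  where _TYPE_ = 'EIGENVAL';
run;

/* Step 4: 95th percentile of each ordered eigenvalue across the       */
/* simulations; these are the thresholds to compare against.           */
proc means data=simEigen p95 noprint;
  var x1-x18;
  output out=threshold p95=;
run;

proc print data=threshold noobs;
  var x1-x18;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In the printed row, x1 is the threshold for the first component, x2 for the second, and so on; compare each with the observed eigenvalue in the same position. To mirror the real data, raise the number of observations toward the actual sample size. With a sample as large as 300K the simulated eigenvalues will all sit very close to 1, so the result will tend to agree with the eigenvalue greater than 1 rule.&lt;/P&gt;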
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bye,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Daniel&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Aug 2018 20:47:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Principal-Component-Analysis-Optimal-number-of-retained-factors/m-p/483200#M25099</guid>
      <dc:creator>Daniel_Paul</dc:creator>
      <dc:date>2018-08-01T20:47:08Z</dc:date>
    </item>
  </channel>
</rss>

