<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Principle component analysis in Enterprise Miner in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Principle-component-analysis-in-Enterprise-Miner/m-p/480385#M7203</link>
    <description>&lt;P&gt;I can't really explain the difference between the two nodes ... but you are doing a lot of work to FORCE your data into the form needed for principal (not "principle") components, specifically continuous variables, and my first thought is not to do this. The results of principal components could be highly dependent on how you perform this transformation from categorical variables to continuous variables. There may be some better way of handling the non-continuous variables. But since you didn't really say much about your data, it's hard to say.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next, since you have a regression node, and I'm assuming that the output of principal components will be fed into the regression node ... DON'T DO THIS. Principal components never looks at the response variable, so it has no way of checking whether the components it keeps are actually good predictors in the regression. It could miss the variables that are good predictors in the regression. What should you do? Partial least squares (or PLS) regression! This picks combinations of variables that are good predictors, and as an added bonus, it has no trouble at all handling categorical variables as categorical. So it's a lot simpler to do: there's no transformation of variables and no prior selection of variables needed. PLS handles all of this.&lt;/P&gt;</description>
    <pubDate>Mon, 23 Jul 2018 10:45:49 GMT</pubDate>
    <dc:creator>PaigeMiller</dc:creator>
    <dc:date>2018-07-23T10:45:49Z</dc:date>
    <item>
      <title>Principle component analysis in Enterprise Miner</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Principle-component-analysis-in-Enterprise-Miner/m-p/480335#M7201</link>
      <description>&lt;P&gt;Dear Sir,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a few questions regarding principal component analysis in Enterprise Miner. Below is my process flow:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="process flow.PNG" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/21914i6A4C8611370D84CA/image-size/large?v=v2&amp;amp;px=999" role="button" title="process flow.PNG" alt="process flow.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The transformation node converts the categorical variables to dummy variables, since principal components only accepts numeric inputs. I have tested two types of principal component nodes. The classification algorithms that I plan to use are Decision Tree and Logistic Regression.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;The settings for the principal component nodes are below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="principle component setting.PNG" style="width: 284px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/21915iB0F62F67952151AE/image-size/large?v=v2&amp;amp;px=999" role="button" title="principle component setting.PNG" alt="principle component setting.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Principal component node setting&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="HP principal component setting.PNG" style="width: 283px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/21916iD4477885BF0A2B7F/image-size/large?v=v2&amp;amp;px=999" role="button" title="HP principal component setting.PNG" alt="HP principal component setting.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;HP principal component node setting&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The results for the nodes:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="result pronciple component.PNG" style="width: 436px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/21917iDAA65A63E8E05221/image-size/large?v=v2&amp;amp;px=999" role="button" title="result pronciple component.PNG" alt="result pronciple component.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="principal component number.PNG" style="width: 549px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/21918i8B36F7B0078C1CC5/image-size/large?v=v2&amp;amp;px=999" role="button" title="principal component number.PNG" alt="principal component number.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;We select the components whose eigenvalue is greater than 1. In this case, there are 42 components, but the selected number of components is 20. My first question: does setting Apply maximum number to Yes under the Max Number cutoff section of the properties limit the number of components to 20 even though the actual number is 42?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My second question is when the principal component node and when the HP principal component node should be used for dimension reduction.&lt;BR /&gt;&lt;BR /&gt;My last question is whether the Variable selection node can be used in place of the principal component node for dimension reduction.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone explain more about this?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you in advance.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Potiu&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jul 2018 10:11:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Principle-component-analysis-in-Enterprise-Miner/m-p/480335#M7201</guid>
      <dc:creator>potiu</dc:creator>
      <dc:date>2018-07-23T10:11:13Z</dc:date>
    </item>
    <item>
      <title>Re: Principle component analysis in Enterprise Miner</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Principle-component-analysis-in-Enterprise-Miner/m-p/480385#M7203</link>
      <description>&lt;P&gt;I can't really explain the difference between the two nodes ... but you are doing a lot of work to FORCE your data into the form needed for principal (not "principle") components, specifically continuous variables, and my first thought is not to do this. The results of principal components could be highly dependent on how you perform this transformation from categorical variables to continuous variables. There may be some better way of handling the non-continuous variables. But since you didn't really say much about your data, it's hard to say.&lt;/P&gt;
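&lt;P&gt;A rough way to see how much the coding can matter, sketched in plain Python rather than SAS or Enterprise Miner. Everything here is made up for illustration: the toy data, the names income and region, and the two integer codings. It is not the dummy coding the Transformation node produces. With only two standardized inputs, the eigenvalues of the correlation matrix are 1 + |r| and 1 - |r|, so it is enough to compare the correlation r under each coding.&lt;/P&gt;
&lt;PRE&gt;
```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def pca_eigenvalues_2var(xs, ys):
    """Eigenvalues of the 2x2 correlation matrix [[1, r], [r, 1]]."""
    r = abs(pearson(xs, ys))
    return (1 + r, 1 - r)


# Hypothetical data: one continuous input and one 3-level categorical input.
income = [10, 12, 20, 22, 30, 32]
region = ["A", "A", "B", "B", "C", "C"]

# Two arbitrary ways of turning the same categories into numbers.
coding_1 = {"A": 1, "B": 2, "C": 3}
coding_2 = {"A": 1, "B": 3, "C": 2}

eig_1 = pca_eigenvalues_2var(income, [coding_1[g] for g in region])
eig_2 = pca_eigenvalues_2var(income, [coding_2[g] for g in region])

# Share of total variance carried by the first component under each coding.
print("coding 1:", eig_1, "first component share:", eig_1[0] / 2)
print("coding 2:", eig_2, "first component share:", eig_2[0] / 2)
```
&lt;/PRE&gt;
&lt;P&gt;Same data, same categories, and the first component carries a clearly different share of the variance depending only on how the levels were numbered.&lt;/P&gt;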
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next, since you have a regression node, and I'm assuming that the output of principal components will be fed into the regression node ... DON'T DO THIS. Principal components never looks at the response variable, so it has no way of checking whether the components it keeps are actually good predictors in the regression. It could miss the variables that are good predictors in the regression. What should you do? Partial least squares (or PLS) regression! This picks combinations of variables that are good predictors, and as an added bonus, it has no trouble at all handling categorical variables as categorical. So it's a lot simpler to do: there's no transformation of variables and no prior selection of variables needed. PLS handles all of this.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jul 2018 10:45:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Principle-component-analysis-in-Enterprise-Miner/m-p/480385#M7203</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-07-23T10:45:49Z</dc:date>
    </item>
  </channel>
</rss>

