aha123
Obsidian | Level 7

I just read this web page about PCA:

http://www.alglib.net/dataanalysis/principalcomponentsanalysis.php

It says, "...In this case, the PCA will give preference to the first (less informative) variable. This drawback is closely connected to the fact that the PCA does not perform linear separation of classes, linear regression or other similar operations, but it merely permits the input vector to be best restored on the basis of the partial information about it. All additional information pertaining to the vector (such as the identification of an image with one of the classes) is ignored..."

If you know any good books elaborating on this point, please let me know.

1 REPLY 1
Rick_SAS
SAS Super FREQ

Jackson (1991), A user's guide to principal components, is an excellent book on PCA. There is a section on PCA vs linear discriminant analysis (LDA) on pp. 334-337, including references to the use of PCA in discriminant analysis. A more recent book is Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning, which is more advanced, but less explicit than Jackson. The issue raised in the web page you quote is covered on pp. 93-94.

In general, PCA and LDA are two different analyses with different objectives. PCA tries to find directions (= linear combinations of variables) that explain the most variance, without using any group labels. LDA tries to find a linear subspace that best separates groups. They can be combined into canonical discriminant analysis (CDA), which in SAS is accomplished by using the CANDISC procedure. The overview section of the CANDISC doc discusses how CDA works. From a practical perspective, CDA works pretty well in many cases.
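To make the contrast concrete, here is a minimal sketch in plain Python (not SAS, and not what PROC CANDISC does internally). Everything in it is an assumption made up for illustration: the toy data, the helper names `mean`, `scatter`, `first_pc`, and `fisher_direction`, and the restriction to two variables and two groups. The first variable has a large spread but is identical in both groups; the second has a small spread but carries all the group information.

```python
import math

def mean(points):
    """Mean of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about `center`."""
    sxx = sxy = syy = 0.0
    for x, y in points:
        dx, dy = x - center[0], y - center[1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    return sxx, sxy, syy

def first_pc(points):
    """Unit vector along the first principal component (labels ignored)."""
    sxx, sxy, syy = scatter(points, mean(points))
    # Major-axis angle of a symmetric 2x2 matrix, in closed form.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def fisher_direction(class0, class1):
    """Unit vector along Sw^{-1} (m1 - m0), the two-group Fisher direction."""
    m0, m1 = mean(class0), mean(class1)
    a0, a1 = scatter(class0, m0), scatter(class1, m1)
    # Pooled within-group scatter matrix Sw.
    sxx, sxy, syy = a0[0] + a1[0], a0[1] + a1[1], a0[2] + a1[2]
    det = sxx * syy - sxy * sxy
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Multiply the explicit inverse of the 2x2 matrix by the mean difference.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)

# Toy data: x ranges over -10..10 in both groups (high variance, no group
# information); y sits near 0 for group 0 and near 1 for group 1.
offsets = [0.2 * ((i % 3) - 1) for i in range(21)]
class0 = [(float(i - 10), 0.0 + offsets[i]) for i in range(21)]
class1 = [(float(i - 10), 1.0 + offsets[i]) for i in range(21)]

pc = first_pc(class0 + class1)        # PCA sees only the pooled cloud
w = fisher_direction(class0, class1)  # LDA uses the group labels

print("first principal component:", pc)  # close to the x-axis
print("Fisher discriminant direction:", w)  # close to the y-axis
```

On this data the first principal component lines up with the high-variance variable, while the discriminant direction lines up with the low-variance variable that actually separates the groups. That is the behavior the quoted web page describes.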

