Hi everyone,
My target variable is binary, and my task is a classic two-class classification problem. My colleague built both a logistic regression model (using the "Regression" node) and a PLS logistic regression model (using the "PLS" node) on our dataset. My question is:
Since PLS regression is a kind of many-to-many regression technique which combines the essence of principal component analysis, when there is only one response variable y (y ~ B(p), Bernoulli distribution) and PLS regression is applied, will we get the same or (at least) a very similar/comparable result to principal component regression?
Many thanks to all of you!
PLS does not "combine the essence of principal components analysis"; the PLS vectors and the PCA vectors are not going to be the same. You should not expect PLS to give the same or even similar results as PCA, because it does not.
PCA computes vectors in X without trying to make them predictive of your response variable. PLS finds vectors in X that are predictive (as much as the data will allow) of the response variable.
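To make the difference concrete, here is a minimal sketch in plain Python (not SAS, and not what the "PLS" node does internally). It assumes synthetic data invented for illustration: x1 has high variance but is unrelated to y, while x2 has low variance and determines a binary y. The variables are centered but deliberately not scaled, and only the first direction of each method is computed: the first PLS weight is taken proportional to X'y, and the first PCA direction is found by power iteration on X'X.

```python
import math
import random


def center(col):
    """Subtract the column mean from every value."""
    mean = sum(col) / len(col)
    return [v - mean for v in col]


def normalize(vec):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def first_pls_weight(x_cols, y):
    """First PLS weight: proportional to X'y, the covariance of each predictor with y."""
    return normalize([sum(xi * yi for xi, yi in zip(col, y)) for col in x_cols])


def first_pca_direction(x_cols, iterations=200):
    """First principal direction via power iteration on X'X; y is never used."""
    p = len(x_cols)
    gram = [[sum(a * b for a, b in zip(x_cols[i], x_cols[j])) for j in range(p)]
            for i in range(p)]
    vec = normalize([1.0] * p)
    for _ in range(iterations):
        vec = normalize([sum(gram[i][j] * vec[j] for j in range(p)) for i in range(p)])
    return vec


# Synthetic data (an assumption for illustration only).
rng = random.Random(0)
n = 2000
x1 = [rng.gauss(0.0, 3.0) for _ in range(n)]   # high variance, unrelated to y
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]   # low variance, determines y
y = [1.0 if v > 0 else 0.0 for v in x2]        # binary response

x_cols = [center(x1), center(x2)]
y_centered = center(y)

pca_dir = first_pca_direction(x_cols)            # follows the variance in X
pls_dir = first_pls_weight(x_cols, y_centered)   # follows the covariance with y

print("first PCA direction:", pca_dir)
print("first PLS weight:   ", pls_dir)
```

In this setup the first PCA direction loads almost entirely on x1 and the first PLS weight almost entirely on x2, so a regression on the first principal component would miss the signal that the first PLS factor picks up. With standardized predictors, or when the high-variance directions in X also happen to be the predictive ones, the two methods can end up much closer.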
There are papers that show examples of performing PLS when you have a binary response.
https://etd.auburn.edu/bitstream/handle/10415/5043/Thesis_McAtee.pdf;sequence=2
https://cedric.cnam.fr/fichiers/RC906.pdf
I'm not sure about a binary response, but with a continuous response the results were almost identical to first calculating the principal components and then running a regression on the first few PCs deemed relevant.
I'm going to have to disagree with this, for reasons explained in my reply above.
As far as I know, PROC PLS does not support a logistic model.
You could try PROC GAMPL or PROC ADAPTIVEREG.