☑ This topic is solved.
Season
Pyrite | Level 9

Hello, I am fascinated by the latent variable modeling capability of partial least squares and its extensions (e.g., partial least squares path modeling). I wonder whether this modeling paradigm has applications in survival analysis techniques such as Cox models. I browsed the survival analysis literature and found that current refinements of parameter estimation methods center on shrinkage (e.g., ridge regression, LASSO), with no obvious applications of latent variable modeling in this field.

Thank you!

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

One example found by Google:

https://pubmed.ncbi.nlm.nih.gov/24105836/

Also, partial least squares generalized regression covers survival analysis, logistic regression, Cox regression and many other types of analyses, and there is an R package that does this. I have coded the logistic partial least squares algorithm in SAS, but it is proprietary, owned by my employer, and I cannot share it.

--
Paige Miller


Season
Pyrite | Level 9

Thank you very much for your time! I have just finished learning the bulk of PLS logistic modeling, but the book section I read said little about how to select the number of components. It only mentioned briefly that cross-validation and goodness-of-fit statistics such as the AIC and likelihood ratios can be used. Could you recommend more specific methods?

Before asking here, I already knew that PLS logistic regression was a possible choice. But in survival analysis, where censoring is common, logistic regression and all of its generalizations are "incompatible" with missing data, so the quality of the final results is conditional on the quality of imputation. (I exclude generalizations that have moved far enough away to carry a different name rather than the suffix "logistic regression", such as Cox regression, which is a de facto generalization of conditional logistic regression.) Therefore, I looked for approaches that handle missing data in other ways. It is true that, while Cox models can handle missing data in the dependent variable without imputation, imputation is still required for independent variables with missing data. Even so, although it cannot be quantified, the dependence of the results on the quality of imputation may decrease.

Thank you again!

PaigeMiller
Diamond | Level 26

The PROC PLS documentation contains examples of how to select the number of dimensions using cross-validation.

Regarding AIC, Wikipedia explains:

To apply AIC in practice, we start with a set of candidate models, and then find the models' corresponding AIC values. There will almost always be information lost due to using a candidate model to represent the "true model," i.e. the process that generated the data. We wish to select, from among the candidate models, the model that minimizes the information loss. We cannot choose with certainty, but we can minimize the estimated information loss.
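
As an illustration of the passage above, here is a minimal Python sketch (not SAS, and not the proprietary macro or the R package mentioned earlier) in which the candidate models are logistic regressions on the first 1, 2, ..., K PLS components, and the one with the smallest AIC = 2p - 2 log L is kept. Several things here are assumptions made for the example: the components come from ordinary PLS1 applied to the 0/1 response, which is a simplification of PLS generalized regression; p counts only the intercept and the k score coefficients, ignoring that the PLS weights were themselves estimated from y; and the function names and simulated data are hypothetical.

```python
import numpy as np

def pls1_scores(X, y, n_components):
    """Score matrix (n x n_components) from ordinary PLS1 with deflation."""
    Xk = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize predictors
    yk = y - y.mean()
    scores = []
    for _ in range(n_components):
        w = Xk.T @ yk                      # weight vector: covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                         # component scores
        p = Xk.T @ t / (t @ t)             # X loadings
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - t * (t @ yk) / (t @ t)   # deflate y
        scores.append(t)
    return np.column_stack(scores)

def logistic_loglik(T, y, n_iter=50):
    """Maximized Bernoulli log-likelihood of a logistic model with intercept."""
    Z = np.column_stack([np.ones(len(y)), T])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):                # Newton-Raphson iterations
        mu = 1.0 / (1.0 + np.exp(-np.clip(Z @ beta, -30, 30)))
        hess = Z.T @ (Z * (mu * (1.0 - mu))[:, None]) + 1e-8 * np.eye(Z.shape[1])
        step = np.linalg.solve(hess, Z.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    eta = np.clip(Z @ beta, -30, 30)
    return float(np.sum(y * eta - np.log1p(np.exp(eta))))

def aic_by_components(X, y, max_components):
    """AIC of the logistic model on the first k PLS scores, for k = 1..max."""
    T = pls1_scores(X, y, max_components)  # components are nested, so fit once
    return {k: 2 * (k + 1) - 2 * logistic_loglik(T[:, :k], y)
            for k in range(1, max_components + 1)}

# Simulated data (assumption): only the first two of ten predictors matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
prob = 1.0 / (1.0 + np.exp(-(1.5 * X[:, 0] - 1.0 * X[:, 1])))
y = (rng.random(200) < prob).astype(float)

aic = aic_by_components(X, y, max_components=5)
best_k = min(aic, key=aic.get)             # candidate with the smallest AIC
print(aic, best_k)
```

Because the component models are nested, the log-likelihood cannot decrease as k grows, so the AIC penalty of 2 per extra component is what stops the selection from always choosing the largest k.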

--
Paige Miller
Season
Pyrite | Level 9

Well, I wasn't asking about the definition of cross-validation and AIC. Rather, I was asking how they can be applied to determine the number of components in partial least squares logistic regression.

Thank you!

PaigeMiller
Diamond | Level 26

@Season wrote:

Well, I wasn't asking about the definition of cross-validation and AIC. Rather, I was asking how they can be applied to determine the number of components in partial least squares logistic regression.


I didn't give you a definition of these items. Both of my comments above are related to how these statistics can be applied.
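
For the cross-validation side, a rough Python sketch of one way to apply it to the number of components in a PLS logistic model is given here. It is not the PROC PLS procedure and makes several assumptions: components come from ordinary PLS1 on the 0/1 response (a simplification of PLS generalized regression), the criterion is held-out deviance summed over 5 random folds, and all function names and the simulated data are hypothetical.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Ordinary PLS1; returns mean, std and rotation R with scores = X_std @ R."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Xk, yk = (X - mean) / std, y - y.mean()
    W, P = [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w
        p = Xk.T @ t / (t @ t)
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - t * (t @ yk) / (t @ t)   # deflate y
        W.append(w)
        P.append(p)
    W, P = np.column_stack(W), np.column_stack(P)
    return mean, std, W @ np.linalg.inv(P.T @ W)

def fit_logistic(T, y, n_iter=50):
    """Logistic regression with intercept, fitted by Newton-Raphson."""
    Z = np.column_stack([np.ones(len(y)), T])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-np.clip(Z @ beta, -30, 30)))
        hess = Z.T @ (Z * (mu * (1.0 - mu))[:, None]) + 1e-8 * np.eye(Z.shape[1])
        step = np.linalg.solve(hess, Z.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def cv_deviance(X, y, max_components, n_folds=5, seed=0):
    """Held-out deviance, summed over folds, for k = 1..max_components."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    dev = np.zeros(max_components)
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Scaling and PLS weights are learned on the training fold only.
        mean, std, R = fit_pls1(X[train], y[train], max_components)
        T_train = ((X[train] - mean) / std) @ R
        T_test = ((X[test_idx] - mean) / std) @ R
        for k in range(1, max_components + 1):
            # Components are extracted sequentially, so the first k columns
            # of the scores are the k-component model.
            beta = fit_logistic(T_train[:, :k], y[train])
            eta = np.clip(beta[0] + T_test[:, :k] @ beta[1:], -30, 30)
            dev[k - 1] += -2.0 * np.sum(y[test_idx] * eta - np.log1p(np.exp(eta)))
    return {k: float(dev[k - 1]) for k in range(1, max_components + 1)}

# Simulated data (assumption): only the first two of ten predictors matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
prob = 1.0 / (1.0 + np.exp(-(1.5 * X[:, 0] - 1.0 * X[:, 1])))
y = (rng.random(200) < prob).astype(float)

dev = cv_deviance(X, y, max_components=5)
best_k = min(dev, key=dev.get)             # smallest held-out deviance
print(dev, best_k)
```

The same loop works with misclassification rate or another held-out criterion in place of deviance; the essential point is that both the PLS weights and the logistic coefficients are re-estimated inside each training fold.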

--
Paige Miller
