Sounds like you want the X-loadings (and possibly the Y-weights) from PROC PLS, which can be obtained using
ods output xloadings=xloadings yweights=yweights;
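For context, here is a minimal sketch of where that ODS statement would sit. The data set `mydata`, the responses `y1 y2`, the predictors `x1-x10`, and the choice of three factors are all hypothetical placeholders. As far as I know, the loadings and weights tables are only produced when the DETAILS option is specified, so it is included here.

```sas
/* Hypothetical names: mydata, y1 y2, x1-x10; NFAC=3 is an arbitrary choice */
ods output xloadings=xloadings yweights=yweights;

proc pls data=mydata method=pls nfac=3 details;  /* DETAILS requests the per-factor tables */
   model y1 y2 = x1-x10;
run;

/* Inspect the captured X-loadings */
proc print data=xloadings;
run;
```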
Thank you, but how does one actually go about using them in a regression? Do I have to combine the x-loadings and y-weights in some way to make a single "joint factor" 1 (where X and Y are both taken into account, as is the purpose of PLS), "joint factor" 2, etc.?
Well this is why I asked you to explain what you meant by factors.
If you are talking about the regression equation from PLS, then you are not talking about "factors". So in one message, you seem to want loadings and in the next message you seem to want regression coefficients, which are not the same thing.
So, again I ask you to explain what exactly you want to find out from PLS and how it will be used.
I must not be very clear about what I'm trying to achieve. Hope this is better:
Instead of using PCA to create "factors" to use as variables in a regression model, my interest is to use PROC PLS to create "factors" that explain the variation in both the predictor and response variables and not just the predictor variables as PCA analyses do. I then want to use these "factors" in a regression model.
While you could use the factors (loadings) to derive the regression (and I'm sure you could look up the formula to do so), if you want the regression, you would bypass the loadings and go straight to the regression equation that PROC PLS produces.
I'm still not sure why you want the factors (loadings). They are useful in many ways, but you haven't really discussed that; you seem to really want the regression equation. So again, I'm not really sure what you want.
The parameter estimates in the regression equation can be obtained via
ods output parameterestimates=parameterestimates;
and these can be used in PROC SCORE to create predicted values.
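A minimal sketch of capturing those estimates, again with hypothetical names (`mydata`, `y1 y2`, `x1-x10`). To my knowledge the ParameterEstimates table is only produced when the SOLUTION option is given on the MODEL statement. The captured table may need reshaping before PROC SCORE can use it; that step is not shown here.

```sas
/* Hypothetical names: mydata, y1 y2, x1-x10; NFAC=3 is an arbitrary choice */
ods output parameterestimates=parameterestimates;

proc pls data=mydata method=pls nfac=3;
   model y1 y2 = x1-x10 / solution;  /* SOLUTION requests the final coefficients */
run;

/* One set of coefficients per response variable */
proc print data=parameterestimates;
run;
```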
The reason why I'm not interested in just the regression coefficients is because I want to use the factors (loadings) in a regression equation with an outcome measure that's not part of the response variables. And, I want to be able to use the factors (loadings) in the regression in addition to confounder variables that I want to add into the model.
Please correct me if I'm wrong, but the ods output parameterestimates would give me the equivalent of coefficients for the group of predictor variables that best predicts the group of response variables?
tyang wrote:
The reason why I'm not interested in just the regression coefficients is because I want to use the factors (loadings) in a regression equation with an outcome measure that's not part of the response variables. And, I want to be able to use the factors (loadings) in the regression in addition to confounder variables that I want to add into the model.
I'm sorry, but apparently my understanding of PLS does not allow me to come to an understanding of what you are asking for. We are simply not communicating here. Again, you seem to me to be explaining tiny portions of the problem you are trying to solve, and I do not see the big picture, nor can I see where you have explained the big picture.
What is "an outcome measure that's not part of the response variables"? Why is that a part of this discussion?
Please correct me if I'm wrong, but the ods output parameterestimates would give me the equivalent of coefficients for the group of predictor variables that best predicts the group of response variables?
Okay, I am correcting you. The PLS parameter estimates (or regression coefficients) that are computed are for ALL predictor variables. You get one set of regression coefficients for all X variables predicting Y1, another set of regression coefficients for all X variables predicting Y2, and so on for each Y. The loadings indicate the linear combinations of the predictor variables that are "highly predictive" of linear combinations of the response variables ("highly predictive" really means maximized squared covariance between linear combination of X and linear combination of Y, after accounting for the effects of previous dimensions...)
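Written out, the "maximized squared covariance" criterion for the first factor looks roughly like this. The symbols are my own notation, not PROC PLS output names: X and Y are the centered (and usually scaled) predictor and response matrices, and w and c are unit-length weight vectors.

```latex
(w_1, c_1) \;=\; \arg\max_{\lVert w \rVert = \lVert c \rVert = 1}
  \bigl[\operatorname{cov}(Xw,\; Yc)\bigr]^2,
\qquad t_1 = X w_1 .
```

Here t_1 is the first X-score. Later factors apply the same criterion after the part of X already explained by earlier scores has been removed.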
The OUTPUT statement will allow you to do this. It is the standard "Input the X vector, and leave the response as missing" approach that was common before PROC SCORE was available. See the documentation "Getting Started: PLS Procedure" for an example using spectrometric calibration.
Steve Denham
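A minimal sketch of the OUTPUT statement approach, under several assumptions: the data set names (`mydata`, `newdata`, `combined`, `plsout`), the variables `y1 y2`, `x1-x10`, `outcome`, `age`, `bmi`, and the choice of three factors are all hypothetical. Observations with missing responses do not contribute to the fit but still receive predicted values and scores. The X-scores written by XSCORE= are the per-observation factor values, so they can then be used as regressors alongside confounders for a separate outcome.

```sas
/* Stack the new observations (responses missing) under the training data */
data combined;
   set mydata newdata;   /* newdata has x1-x10 only, so y1 y2 are missing */
run;

proc pls data=combined method=pls nfac=3;
   model y1 y2 = x1-x10;
   /* PREDICTED= names one variable per response; XSCORE= is a prefix,
      giving xscr1, xscr2, xscr3 */
   output out=plsout predicted=pred_y1 pred_y2 xscore=xscr;
run;

/* Regress a separate outcome on the PLS scores plus numeric confounders */
proc reg data=plsout;
   model outcome = xscr1-xscr3 age bmi;
run;
quit;
```

Note that the scores were extracted to explain `y1 y2`, not `outcome`, so whether they are useful predictors of `outcome` is an empirical question.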
I will try PLS and see if it gives me what I need. Thank you so much, everybody, for your input. I did try PCA, though, and managed to back-transform the coefficients to show each variable's contribution.
Can you please share how you back-transformed the coefficients to show variable contribution using PCA?
Use the betas derived from the regression on the components, multiply them by the factor loadings across components 1 to n, and then divide by the standard deviation to reverse the standardization of the coefficients. You then get the contribution back by dividing by the R-square of each component * 100. The downside is the t significance.
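One possible reading of the first two steps, as a hedged sketch. The notation is assumed, not taken from the post: the components are scores of the standardized predictors computed with unit-length eigenvectors, γ_k is the regression coefficient on component k, v_jk is the loading of variable j on component k, and s_j is the standard deviation of variable j.

```latex
\beta_j^{\mathrm{std}} \;=\; \sum_{k=1}^{n} v_{jk}\,\gamma_k ,
\qquad
\beta_j \;=\; \frac{\beta_j^{\mathrm{std}}}{s_j} .
```

If the loadings are instead scaled by the square root of the eigenvalues, the first formula needs a matching rescaling. The contribution step is not written out here because the post does not specify it precisely enough.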