Hello,
Let's say I have a dataset with 20 variables that represent food intake in grams for 1000 individuals. I'm using Principal Component Analysis (PCA) and decided to retain two components, which represent dietary patterns in my case. The following code is used for this purpose.
ods graphics on;
proc factor data=dat
    nfact=2
    method=principal
    priors=one
    plot = scree
    msa
    score
    flag=0.25
    rotate=varimax
    round
    out=dat_scores (keep=sampleid wts_s factor:) outstat=FactOut;
    var &food; /*20 food groups*/
run;
ods graphics off;
Factor loadings are used to interpret and label each factor or component. Nevertheless, when calculating factor or component scores for each individual, PCA uses standardized scoring coefficients instead of factor loadings. Could you please explain why that is? I've consulted many references and have not found the reason for this.
Thanks a lot,
Alejandro.
Standardized variables are obtained by subtracting each original variable's mean and then dividing by its standard deviation, so they have mean 0 and standard deviation 1. Component scores by individual are computed from these standardized variables, not from the raw data. The scores come from the linear combination formula, in which each standardized variable is multiplied by its standardized scoring coefficient; the factor loading matrix is not used for this. Loadings describe how strongly each variable correlates with a component, which is why they serve for interpretation rather than as weights. The linear combination formula is what the SCORE option uses when the NOINT option is not specified. With NOINT, the intercept is omitted from the analysis, covariances or correlations are not corrected for the mean, and even the SCORE option will not take standardized data [1].
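To make the linear combination formula concrete, here is a sketch in standard textbook notation. The symbols are my own and do not come from the SAS output: x_ij is the raw intake of food j for individual i, z_ij the standardized value, a_jk the loading of food j on component k, b_jk the standardized scoring coefficient, R the correlation matrix of the foods, and lambda_k the k-th eigenvalue of R. The last line assumes unrotated principal components; after an orthogonal rotation such as varimax, the relation B = R^{-1} A still holds with the rotated loadings.

```latex
\begin{align*}
z_{ij} &= \frac{x_{ij} - \bar{x}_j}{s_j}
  && \text{(standardize each food variable)} \\
F_{ik} &= \sum_{j=1}^{p} b_{jk}\, z_{ij}
  && \text{(score of individual $i$ on component $k$)} \\
B &= R^{-1} A
  && \text{(scoring coefficients from loadings)} \\
b_{jk} &= \frac{a_{jk}}{\lambda_k}
  && \text{(unrotated principal components only)}
\end{align*}
```

Read this way, the scoring coefficients are the loadings rescaled so that each score has unit variance, which is why the two tables look proportional but are not interchangeable.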
If you are primarily interested in getting the component scores as linear combinations of the observed variables, the factor loading matrix is not the right table to use; use the standardized scoring coefficients instead. Again, when applying the [linear combination] formula you must use the standardized observed variables (with means 0 and standard deviations 1), not the raw data. [2]
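As a small numerical check of the point above, here is a minimal sketch in plain Python (not SAS). It assumes a made-up dataset with only two food groups and one unrotated component, so the eigenvalue and eigenvector of the 2x2 correlation matrix can be written in closed form. All data values and variable names are hypothetical.

```python
import math
import statistics

# Hypothetical toy data: intake in grams of two food groups for six individuals.
x1 = [120.0, 80.0, 150.0, 60.0, 100.0, 90.0]
x2 = [30.0, 22.0, 41.0, 18.0, 25.0, 32.0]


def standardize(values):
    """Return z-scores: subtract the mean, divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


z1 = standardize(x1)
z2 = standardize(x2)
n = len(z1)

# Correlation of the two variables (sample covariance of the z-scores).
r = sum(a * b for a, b in zip(z1, z2)) / (n - 1)

# For a 2x2 correlation matrix with r > 0 (true for this toy data), the first
# eigenvalue is 1 + r and its eigenvector is (1, 1) / sqrt(2).
eigenvalue = 1.0 + r
eigenvector = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# Loadings: eigenvector * sqrt(eigenvalue), used for interpretation.
loadings = [v * math.sqrt(eigenvalue) for v in eigenvector]
# Scoring coefficients: eigenvector / sqrt(eigenvalue), i.e. loading / eigenvalue.
coefs = [v / math.sqrt(eigenvalue) for v in eigenvector]

# Component scores: scoring coefficients applied to the standardized variables.
scores = [coefs[0] * a + coefs[1] * b for a, b in zip(z1, z2)]
# The same linear combination, but weighted by loadings instead.
loading_weighted = [loadings[0] * a + loadings[1] * b for a, b in zip(z1, z2)]

score_var = statistics.variance(scores)
loading_var = statistics.variance(loading_weighted)

# Scores have mean 0 and variance 1, so this covariance is also the
# correlation between the first food group and the component.
corr_z1_score = sum(a * s for a, s in zip(z1, scores)) / (n - 1)

print("variance of scores from scoring coefficients:", round(score_var, 6))
print("variance of loading-weighted sums:", round(loading_var, 6))
print("loading of x1:", round(loadings[0], 6))
print("correlation of x1 with scores:", round(corr_z1_score, 6))
```

In this sketch the scores built from scoring coefficients have unit variance, and the correlation between a variable and the score reproduces its loading. Weighting by the loadings instead gives a sum whose variance is the eigenvalue squared, so it is not a standardized score.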
Raw data can be used with factor loadings.
References
[2] Principal Component Analysis