I have a dilemma and although I have searched the internet for answers I'm still confused.
I have a dataset with about 400 variables. I would like to reduce their number by applying PROC VARCLUS with the centroid method and then retaining one variable per cluster.
Now, if I understood correctly, this procedure is based on R-squared, which implies linearity. That is a strong assumption that I cannot test on all 400 variables.
My question is, does the procedure work for nonlinear relationships or not?
Are there any papers, as far as you know, that treat this subject (PROC VARCLUS and nonlinearity) that I could read?
Thank you
Yes, PROC VARCLUS (like PCA and PROC FACTOR) assumes linearity (and also normality).
On the other hand, in a data mining context it usually works (unless you have extreme nonlinearities).
Search for "nonlinear PCA", "PROC PRINQUAL", and "PROC NEURAL" if you are worried about nonlinearity.
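To make "extreme nonlinearities" a little more concrete, here is a minimal sketch in Python (not SAS) that uses the squared Pearson correlation as a stand-in for the R-squared that VARCLUS relies on. The data and the helper name `pearson_r` are made up for illustration. It compares a monotone, moderately curved relationship with a U-shaped one.

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (hypothetical helper)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: x is evenly spaced and symmetric around zero.
x = [i / 10 for i in range(-30, 31)]

mild = [math.exp(v / 3) for v in x]   # monotone, moderately curved
extreme = [v ** 2 for v in x]         # non-monotone (U-shaped)

r2_mild = pearson_r(x, mild) ** 2
r2_extreme = pearson_r(x, extreme) ** 2

print(f"R^2, x vs exp(x/3): {r2_mild:.3f}")
print(f"R^2, x vs x**2:     {r2_extreme:.3f}")
```

On this toy data the curved but monotone pair still shows a high R-squared, while the U-shaped pair shows an R-squared of essentially zero even though one variable is an exact function of the other. That second kind of relationship is the case a correlation-based clustering would likely miss.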
There is material on nonlinear PCA here:
http://support.sas.com/resources/papers/proceedings14/SAS313-2014.pdf
I don't know if this meets your needs, but check the fifth example in the PROC CORR documentation:
Cronbach’s Coefficient Alpha
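For reference, the raw Cronbach's alpha for k variables is k/(k-1) * (1 - sum of the item variances / variance of the total score). A minimal sketch in Python (not SAS), with made-up scores and a hypothetical helper name `cronbach_alpha`:

```python
from statistics import variance

def cronbach_alpha(items):
    """Raw Cronbach's alpha.

    items: list of columns (one per variable), each a list of scores
    of equal length, one entry per observation.
    """
    k = len(items)
    # Total score per observation (row sums across the variables).
    totals = [sum(row) for row in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

# Made-up scores for three variables on five observations.
items = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 5, 1],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Values near 1 indicate that the variables move together closely. Note that alpha is built from variances and covariances, so it is also a linear measure.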