Hi,
I have to run a set of correlations on a multitude of variables from different datasets against one set variable. I have run factor analysis to reduce the number of variables in the different datasets, but am unsure of how to use the factor scores on the original dataset in order to run correlations.
I am attaching the original dataset I used for the Factor Analysis along with the Factor Scores it output. In the attached dataset, Q1-Q4 is Factor 1 and Q5 is Factor 2. How should I use the factor scores?
My objective is to run correlations of Factor 1 & Factor 2 with the one set variable I have.
Also note that while the factor analysis was done at the total geography level, the correlation analysis data is at the individual geography level.
Can anyone please help?
Thank you!
Unfortunately, I consider this too vague to provide a suggestion on what you should do. In particular, you talk about "run correlations" which to me is not something I would do after doing a factor analysis. Perhaps it would help if you take a step back and describe in words, at a high level, the data and the goals of the analysis without going down to the next level of detail about what exact statistical methods you think you should use.
Nor does your description of your data set help, and most of us will not download Microsoft Office files as they are a security threat. The proper way to present data here in the SAS Communities is via these instructions: https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/
I should have added apologies in advance to my vague note above because I felt it would be vague even as I was writing it 😞
The correlation exercise is being done in order to see if an existing survey output can be replaced by another one. One such replacement may come from the 5 scores that are in the dataset I posted. So I ran a factor analysis (FA) to reduce those, with the hope of then correlating the reduced factors with the existing survey output variable.
HTH?
| Q1 | Q2 | Q3 | Q4 | Q5 | Factor1 | Factor2 |
| --- | --- | --- | --- | --- | --- | --- |
| 53.73 | 51.44 | 53.42 | 46.42 | 69.84 | 0.258001075 | 0.647390095 |
| 54.45 | 52.26 | 54.04 | 47.11 | 69.96 | 0.910862931 | 0.963478325 |
| 53.7 | 51.69 | 53.4 | 46.58 | 70.14 | 0.442257929 | 0.90326093 |
| 54.89 | 52.82 | 54.65 | 48.2 | 70.79 | 1.660837816 | 1.77537136 |
| 51.54 | 49.14 | 51.16 | 44.6 | 69.62 | -1.663734433 | -0.200556119 |
| 51.71 | 49.56 | 51.95 | 45.65 | 68.33 | -1.607211737 | -1.041845404 |
| 53.23 | 51.01 | 53.02 | 46.56 | 69.16 | -0.258323196 | 0.005303993 |
| 52.11 | 49.26 | 51.6 | 44.9 | 68.95 | -1.597151714 | -0.601888451 |
| 53.48 | 50.63 | 52.23 | 45.71 | 69.89 | -0.369418323 | 0.472224307 |
| 53.52 | 51.03 | 52.93 | 45.96 | 70.3 | 0.066324904 | 0.887321494 |
| 52.83 | 50.47 | 52.11 | 45.41 | 70.98 | -0.277845071 | 1.203947433 |
| 52.55 | 50.67 | 52.13 | 45.53 | 69.75 | -0.604627497 | 0.259908893 |
| 52.63 | 50.38 | 52.09 | 45.29 | 70.06 | -0.633596296 | 0.465206974 |
| 53.12 | 50.84 | 52.28 | 45.75 | 69.79 | -0.375641776 | 0.38480552 |
| 53.69 | 51.58 | 52.8 | 46.5 | 69.65 | 0.129248027 | 0.472531862 |
| 52.98 | 51.05 | 52.43 | 46.22 | 69.78 | -0.255704503 | 0.407790789 |
| 52.6 | 50.45 | 52.22 | 46.26 | 69.22 | -0.747456386 | -0.137740195 |
| 53.52 | 51.31 | 53.1 | 46.34 | 69.67 | 0.050752841 | 0.457357297 |
| 54.53 | 51.78 | 53.69 | 47.89 | 66.93 | -0.140803183 | -1.399213122 |
| 55.13 | 52.96 | 54.44 | 48.81 | 66.88 | 0.61734577 | -1.172211796 |
| 57.13 | 55.32 | 57.1 | 50.97 | 66.26 | 2.388636521 | -0.943603271 |
| 55.49 | 53.87 | 55.14 | 49.62 | 65.34 | 0.773845484 | -2.145014425 |
| 55.87 | 54.18 | 55.24 | 50.89 | 65.82 | 1.233400816 | -1.663826491 |
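For readers following along, the mechanics described above (correlating each factor-score column with one other variable) might look roughly like the sketch below. It is written in Python rather than SAS, and it is only an illustration under stated assumptions: the existing survey output variable is not included in this thread, so `Q1` from the table stands in as a placeholder target, and only the first five rows are used. The helper name `pearson` is made up for this example. Each row must carry a factor score and a target value for the same unit for the correlation to mean anything.

```python
import math

# First five rows of the table above: (Q1, Factor1, Factor2).
rows = [
    (53.73, 0.258001075, 0.647390095),
    (54.45, 0.910862931, 0.963478325),
    (53.70, 0.442257929, 0.903260930),
    (54.89, 1.660837816, 1.775371360),
    (51.54, -1.663734433, -0.200556119),
]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / denom


# ASSUMPTION: Q1 is only a stand-in target. The real survey output
# variable is not shown in the thread and would replace this column.
target = [r[0] for r in rows]
factor1 = [r[1] for r in rows]
factor2 = [r[2] for r in rows]

r_f1 = pearson(factor1, target)
r_f2 = pearson(factor2, target)
print(f"Factor1 vs target: {r_f1:.3f}")
print(f"Factor2 vs target: {r_f2:.3f}")
```

Whether this is the right analysis for the stated goal is a separate question, which is what the replies in this thread are asking about.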
I'm still lost, as I don't understand the reason for running a correlation after you run a factor analysis.
So again I ask for you to describe the problem you are trying to solve, in words, at a high conceptual level, without discussing specific statistical methods.