Hello there.
I am analyzing national survey data with a complex study design, so I would like to incorporate sampling weights
and account for clusters and strata.
I am using food frequency questionnaire data. I derived 24 food groups from 112 food items. I want to use principal
components analysis to derive dietary patterns. Which procedure can I use to run principal components
analysis for this kind of study design?
I know of PROC PLS and PROC FACTOR, but these seem to be for data obtained by simple random sampling.
Any assistance would be appreciated.
You would need to compute the survey weights yourself (or perhaps you already have them), and save the weights in a variable in the data set. Then you could include this variable in the WEIGHT statement within either PROC PRINCOMP or PROC FACTOR. There is no WEIGHT statement in PROC PLS.
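As a rough sketch of what that could look like (the data set name work.ffq, the weight variable samp_wt, the food group variables fg1-fg24, and the choice of 5 components are all placeholders I made up, not anything from your data):

```sas
/* Hypothetical names: work.ffq = analysis data set,          */
/* samp_wt = survey weight, fg1-fg24 = the 24 food groups.    */

/* Weighted principal components; OUT= saves component scores */
proc princomp data=work.ffq out=work.pc_scores n=5;
   var fg1-fg24;
   weight samp_wt;
run;

/* Same idea with PROC FACTOR, principal component extraction */
/* followed by a varimax rotation (rotation is optional)      */
proc factor data=work.ffq method=principal rotate=varimax nfactors=5;
   var fg1-fg24;
   weight samp_wt;
run;
```

Keep in mind that the WEIGHT statement only weights the observations when the correlation matrix is computed. Neither procedure has STRATA or CLUSTER statements, so the clustering and stratification in your design are not accounted for here.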
Adding to my previous reply:
The usual reason you want to perform principal components on the data is so that you can analyze the data in fewer dimensions, such as by plotting in fewer dimensions to detect outliers or clusters, or to use the fewer dimensions in some subsequent analysis.
The usual reason you use a complex survey design is that you have clusters or strata that you can't sample (or haven't sampled) in the same proportion, and the survey weights let you estimate a population parameter while taking into account the amount of sampling in each cluster or stratum.
However, putting the two together, it sounds strange to me to use the survey weights to estimate "population principal components", as I don't see the purpose of surveys and weighting as compatible with the idea of principal components. That doesn't mean you can't do it, but it does mean I'm not comfortable doing it myself (you may be).