I'm hoping someone can help me understand when it is appropriate to add a WEIGHT statement to PROC MEANS versus when I need to use PROC SURVEYMEANS. I have read the documentation and a few papers on the topic, but I still can't quite get a handle on this for my specific situations. I have little experience with survey data, so I apologize in advance for my ignorance.
Situation #1: I'm using an insurance claims database and I'd like to standardize my estimates to the U.S. population using weights I have calculated from U.S. census data. In other words, I will apply weights such that the demographic distribution of people in my dataset matches the demographic distribution of the U.S. and no subpopulations are over- or under-represented. There was no complex sampling scheme for this database -- it is a convenience sample. Is using a WEIGHT statement with PROC MEANS sufficient if I want to calculate the weighted average of, say, healthcare expenditures, provider visits, etc.? Why or why not?
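To make Situation #1 concrete, here is a minimal sketch of the two alternatives I am comparing. The dataset name `claims` and the variables `total_expenditures`, `provider_visits`, and `census_wt` are placeholders made up for illustration, not names from the actual database.

```sas
/* Hypothetical dataset and variable names, for illustration only. */

/* Option A: PROC MEANS with a WEIGHT statement */
proc means data=claims mean stderr;
   var total_expenditures provider_visits;
   weight census_wt;   /* census-based standardization weight (assumed name) */
run;

/* Option B: PROC SURVEYMEANS with the same weight and no STRATA or
   CLUSTER statements, since the data have no complex sampling design */
proc surveymeans data=claims mean stderr;
   var total_expenditures provider_visits;
   weight census_wt;
run;
```

As far as I understand, both steps should report the same weighted mean, because the point estimate is the same weighted average either way. The difference would be in how each procedure computes the standard error, which is the part I am unsure about.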
Situation #2: Using the same database consisting of a convenience sample of the U.S. population, I want to propensity-score weight two treatment groups. I have calculated weights such that the two groups will be similar on important baseline characteristics when the weights are applied. I want to compare outcomes (e.g., healthcare expenditures, provider visits, etc.) between the weighted groups. Is using a WEIGHT statement with PROC MEANS sufficient, or do I need to use PROC SURVEYMEANS?
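For Situation #2, the comparison I have in mind would look roughly like the sketch below. Again, `claims`, `treatment`, `ps_wt`, `total_expenditures`, and `provider_visits` are hypothetical names chosen for illustration.

```sas
/* Hypothetical dataset and variable names, for illustration only. */

/* Option A: weighted means by treatment group with PROC MEANS */
proc means data=claims mean stderr;
   class treatment;    /* two-level treatment indicator (assumed name) */
   var total_expenditures provider_visits;
   weight ps_wt;       /* propensity-score weight (assumed name) */
run;

/* Option B: the same group-level means with PROC SURVEYMEANS,
   treating each treatment group as a domain */
proc surveymeans data=claims mean stderr;
   domain treatment;
   var total_expenditures provider_visits;
   weight ps_wt;
run;
```

Here the groups are requested with CLASS in PROC MEANS and with DOMAIN in PROC SURVEYMEANS. My question is whether the standard errors from the first step can be trusted for comparing the groups, or whether the second form is needed.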
I suspect using a WEIGHT statement might be sufficient for both of these purposes, but the only examples I can find are situations where the WEIGHT statement is NOT sufficient. Again, apologies for my complete ignorance on this topic! Any helpful input or papers you can recommend would be greatly appreciated!