06-23-2012 01:39 PM
Hi there. I am very new to SAS, and to statistics in general (actually I have never used SAS before and I am very much stuck; it took me a whole day just to figure out how to run the program and print out each line in the dataset). I have a dataset about the lifestyle and health of the population of a country. The survey was conducted using a stratified clustered multistage design. I have to take a 30% sample from this and perform some descriptive statistical analysis. Each individual in the dataset has a weight assigned (this is because some provinces were overrepresented while others were underrepresented, and also because different members of a household had different probabilities of being selected). When I take my sample, do I take a simple random sample, and then include the weights afterwards when I calculate the means, the variance and so on? Or do I take the weights into account while I select the sample (so that I don't take too many individuals from overrepresented provinces)? What would the SAS code look like?
I really appreciate your feedback
06-23-2012 10:41 PM
If I understand the sampling plan correctly then you should be subsampling clusters (households) within strata (provinces, etc.). The analysis could look something like this:
/* Subsample 30% of clusters (households) within each stratum.
Extracts a subsample in dataset subStudy, adds variables SelectionProb and SamplingWeight.
Input must be sorted by the strata variable (assumed here to be named Province). */
proc surveySelect data=originalStudy out=subStudy rate=0.3;
strata Province;
cluster HouseholdID; /* identifies a household within a province */
run;

/* New weight = original weight divided by the subsampling probability */
data subWeightedStudy;
set subStudy;
newWeight = originalWeight / SelectionProb;
run;

/* Estimate means and rates.
Households is an optional input dataset that gives the total number of
clusters (households) (in variable named _TOTAL_) per stratum (province) */
proc surveyMeans data=subWeightedStudy total=Households;
strata Province;
cluster HouseholdID;
class LifeStyleClass; /* categorical, so rates are estimated */
var LifeStyleClass HealthVar; /* Estimate LifeStyleClass rates and HealthVar mean */
weight newWeight;
run;
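To spell out the reasoning behind the new weights, here is a short sketch. It assumes the original weight is the inverse of the original selection probability, and the symbols are mine, not SAS names. An individual ends up in the subsample only if selected in the original survey and again in the subsampling step, so the two probabilities multiply and the weight is the inverse of the product:

```latex
\pi_i^{\text{new}} = \pi_i^{\text{orig}} \times \pi_i^{\text{sub}},
\qquad
w_i^{\text{new}} = \frac{1}{\pi_i^{\text{new}}}
                 = \frac{w_i^{\text{orig}}}{\pi_i^{\text{sub}}}
% Hypothetical numbers: w_i^{orig} = 200 and \pi_i^{sub} = 0.3
% give w_i^{new} = 200 / 0.3 \approx 666.7
```

With made-up numbers, a person whose original weight is 200 and whose household is subsampled with probability 0.3 would carry a new weight of about 666.7. The actual subsampling probability in a stratum may differ slightly from 0.3 because the number of selected households is a whole number.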
The SAS documentation contains a simple example of a stratified cluster sampling design analysis.