Hi everybody,
I am working on a survey, and I am trying to see how I can use SAS to estimate standard errors and some other statistics. I have read a lot about this and have concluded that the first, and probably most important, thing to do is to determine which design my survey has, so that I can use the right SAS procedures.
What I need from you, if possible, is:
1- Confirmation that this is correct:
I am sampling households and then interviewing all members of each of these households -----> so this is one-stage cluster sampling. IS THAT CORRECT?
2- If I am right in -1-, then how can I use proc surveymeans etc.?
Do I just tell SAS that CLUSTER is my household variable? And what else?
Thanks a lot
Am I in the right place for my question? If not, please show me where to post it.
Thank you.
Yes, it sounds like you have a cluster sample here, but it's difficult to tell without more information. As long as you have drawn a simple random sample of households from your population and have not stratified first (e.g. drawn households separately in rural and urban areas), then what you have is a single-stage cluster sample. Let's say you were trying to estimate the average age. Then in SAS, you would issue the following commands:
proc surveymeans data=yourdata;
cluster Household;
var age;
weight YourFinalSampleWeight;
run;
where YourFinalSampleWeight is the variable in your dataset that contains the value of the final sample weight for the individual, which in this case would be the inverse probability of selection, assuming you are not making any non-response or post-stratification adjustments.
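Since the goal is standard errors and confidence intervals, the same call can request those statistics by name. This is only a minimal sketch: the dataset and variable names are the same placeholders as above, and MEAN, STDERR and CLM are standard PROC SURVEYMEANS statistic keywords (they are also part of the default output).

```sas
/* Sketch only: yourdata, Household, age and YourFinalSampleWeight are placeholders. */
proc surveymeans data=yourdata mean stderr clm;
   cluster Household;              /* households are the sampled clusters */
   var age;                        /* analysis variable */
   weight YourFinalSampleWeight;   /* final weight carried by each member */
run;
```

CLM gives 95% confidence limits for the mean by default; the ALPHA= option on the PROC statement changes the level.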
Best of luck!
Thanks a lot for your response, Statistician13.
Indeed, it seems to be one-stage sampling because I select all members of the households. I can't change the design now; the survey will be completed soon, and my job is to get standard errors, confidence intervals, etc.
Here I have 2 more questions:
1- Do I have to tell SAS the _TOTAL_ of households in my frame population, or the sampling rate (if the total is not available)?
2- Is the weight the weight of my household MEMBERS?
For information: yes, I have non-response adjustments.
Thanks
Sorry, I didn't include that in my first response. Yes, you will specify the total number of households in the population in the proc surveymeans like this:
proc surveymeans data=yourdata total=NumberOfHouseholdsInPopulation;
cluster Household;
var age;
weight YourFinalSampleWeight;
run;
Your weight is the inverse of the probability of selection of a household, since you are taking all household members into your sample. So if you sampled 20 households out of 200, your weight would be YourFinalSampleWeight=1/(20/200)=10.
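If only the sampling rate is known rather than the household total, PROC SURVEYMEANS also accepts RATE= in place of TOTAL=. The sketch below uses the 20-out-of-200 example, so the counts are illustrative, and yourdata_w and SelectionProb are hypothetical names. It computes the base weight only; a non-response adjustment would have to be multiplied in separately.

```sas
/* Sketch only: 20 of 200 households sampled (illustrative counts). */
data yourdata_w;                               /* hypothetical output dataset */
   set yourdata;
   SelectionProb = 20 / 200;                   /* household selection probability */
   YourFinalSampleWeight = 1 / SelectionProb;  /* base weight of 10, same for every member */
run;

proc surveymeans data=yourdata_w rate=0.10;    /* RATE= used in place of TOTAL= */
   cluster Household;
   var age;
   weight YourFinalSampleWeight;
run;
```

Either TOTAL= or RATE= is what lets the procedure apply a finite population correction; if both are omitted, no correction is used.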
Great.
And does the sampling method for the households have no impact on my SAS procedure? In other words, is the syntax the same whether I sampled my households by SRS or by any other method?
I'm not sure I fully understand your last question. The sampling rate of the households will be taken into account when you calculate the weights. You will need to add the CLUSTER statement to your SAS code, so it's not the same as the code for an SRS.
This is what I mean:
for instance, take these two possibilities for my cluster sampling:
1 - First sample the households (clusters) by simple random sampling, then select all members of the households.
2 - First sample the households (clusters) with probability proportional to size, then select all members of the households.
Are these two possibilities the same for the SAS procedure? Do I not have to tell it that I used simple random sampling for my clusters?
I hope I am clearer now.
Thank you