Hello,
I am using PROC SURVEYSELECT to select a sample from clustered data. I need a way to compute the sample size for clustered data, or a method that minimizes the variance so that I do not have to specify a sample size. Any help is welcome. Thanks.
Right now, I am using the following code:
PROC SURVEYSELECT DATA=marco_estratificado out=sample method=srs SAMPSIZE=4952;
SAMPLINGUNIT ID_AGEB ;
RUN;
Regards,
Shary
You need to provide a little more information. For example, you want to minimize the variance of what?
Do you actually have more than 4952 clusters?
You have to supply something to tell SAS how many records you want.
It might help to provide a small example data set with a few clusters of records, along with an example of what you would like to be selected from it.
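To illustrate the point about telling SAS how much to select: with a SAMPLINGUNIT statement, the requested size refers to clusters rather than individual records, and SAMPRATE= can be used in place of SAMPSIZE= to request a fraction of the clusters. The following is only a minimal sketch that reuses the data set and cluster variable from the original post; the 10% rate and the seed value are arbitrary assumptions chosen for illustration.

```sas
/* Select roughly 10% of the ID_AGEB clusters by simple random sampling. */
/* SAMPRATE=0.10 and SEED=12345 are illustrative assumptions, not values */
/* taken from the original post.                                         */
PROC SURVEYSELECT DATA=marco_estratificado OUT=sample
                  METHOD=SRS SAMPRATE=0.10 SEED=12345;
   SAMPLINGUNIT ID_AGEB;   /* sample whole clusters, not single records */
RUN;
```

Either way, one of SAMPSIZE= or SAMPRATE= still has to be supplied; the procedure does not derive a sample size on its own.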
Hello:
Thank you for replying. Yes, I have more than 4952 clusters. What I want is to calculate the sample size for the clusters. I saw in a book that you can minimize the variance of the mean and the cost function using Lagrange.
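For context, the textbook result being described is probably the classic optimum allocation for two-stage cluster sampling, which is worked out before any selection is done. The sketch below rests on assumptions that are not stated in this thread: clusters of equal size, $n$ sampled clusters with $m$ units subsampled in each, finite population corrections ignored, a linear cost $C = c_1 n + c_2 n m$, and variance components $S_1^2$ (between clusters) and $S_2^2$ (within clusters). All of these symbols are introduced here for illustration.

```latex
% Approximate variance of the sample mean and the cost constraint
V(\bar{y}) \approx \frac{S_1^2}{n} + \frac{S_2^2}{n m},
\qquad
C = c_1 n + c_2 n m

% Lagrangian: minimize the variance subject to a fixed total cost C
\mathcal{L}(n, m, \lambda)
  = \frac{S_1^2}{n} + \frac{S_2^2}{n m}
  + \lambda \left( c_1 n + c_2 n m - C \right)

% Setting the partial derivatives with respect to n and m to zero gives
m_{\text{opt}} = \frac{S_2}{S_1} \sqrt{\frac{c_1}{c_2}},
\qquad
n_{\text{opt}} = \frac{C}{c_1 + c_2 \, m_{\text{opt}}}
```

Under these assumptions, $n_{\text{opt}}$ (rounded to a whole number) is the number of clusters that would be passed to SAMPSIZE=. The variance components and unit costs have to come from a pilot study or prior survey.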
In random sampling? I think you may need to read more carefully what that book was referring to.
Lagrange normally refers to building transforms from the data and would more typically be done after a selection, I would think. Not as part of a random sample selection.