Hi there,
I'm automating a two-stage probability-based sample selection method in SAS that will be used on 20 or so different input data sets, each with its own specifications. In stage 1, I select PSUs with probability proportional to size (PPS), and in stage 2, I select SSUs (individuals here) by simple random sampling (SRS) within PSUs. The number of SSUs in each PSU is not equal, but that's not a big deal.
In my implementation, I use PROC SURVEYSELECT with METHOD=PPS. I once got an error reading "For METHOD=PPS, the relative size of each sampling unit must not exceed (1/SAMPSIZE)", which I understand. It is well explained here: http://support.sas.com/kb/23/759.html
To fix it, I manually went into the input sample list, removed the PSU causing trouble, selected that PSU with certainty for my sample (which seems logical to me since its calculated probability of selection is greater than 1), and then selected the remaining n-1 PSUs via PPS using PROC SURVEYSELECT.
To me, this seems clunky and annoying, but I haven't been able to find another way to work around this issue. Does anyone know of a way to have SAS treat these PSUs as being selected with certainty rather than having it spit out an error that the probability of selection is greater than 1? I've looked at the other variations on the PPS method in SAS and they all seem to have the same way of calculating probability of selection.
Thank you!!!!
What happens if you force p=1 when p>1 for those specific records rather than removing them?
If p=1, the unit is automatically selected.
So the PROC automatically calculates the probability of selection based on the sample size specified, the size of the PSU, and the total size of the stratum which the PSU is in. The formula is quite straightforward and can be found here: http://support.sas.com/kb/23/759.html
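For anyone following along, the calculation described above can be sketched outside SAS. This is only a minimal Python illustration, assuming the expected selection probability is n * size / total (which is what the 1/SAMPSIZE bound in the error message implies). The function name and the PSU sizes are made up for the example.

```python
def selection_probabilities(sizes, n):
    """Expected PPS selection probability per unit: n * size / stratum total."""
    total = sum(sizes)
    return [n * s / total for s in sizes]


# Hypothetical stratum of five PSUs with a requested sample size of 3.
sizes = [600, 150, 100, 100, 50]
probs = selection_probabilities(sizes, 3)

# The first PSU holds 60% of the stratum, so 3 * 0.6 = 1.8 > 1.
# That is the situation that triggers the METHOD=PPS error.
too_big = [i for i, p in enumerate(probs) if p > 1]
```

Here only the first PSU is flagged; the other four all have a relative size below 1/3.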
I am looking for a way to override part of that when the probability of selection is calculated to be greater than 1.
If it wasn't 6AM I might remember that 🙂
It mentions using CERTSIZE or SAMPSIZE, but I'm assuming you want a non-manual way. One way would be to precalculate the selection probabilities yourself and then build the SAMPSIZE value via a macro.
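The precalculation idea could look roughly like the sketch below. It is written in Python purely to show the logic, not as SAS code, and it assumes positive size measures and the same n * size / total rule as the error message. The function name and the PSU ids and sizes are hypothetical. The check has to be repeated, because once a certainty PSU is set aside, both the remaining sample size and the remaining total shrink, which can push another PSU over the limit.

```python
def split_certainty_units(sizes, n):
    """Split units into certainty selections and the rest.

    sizes: dict mapping unit id -> positive size measure
    n:     total number of units to select
    Returns (certainty_ids, remaining_ids, remaining_n).
    """
    remaining = dict(sizes)
    certain = []
    while n - len(certain) > 0:
        k = n - len(certain)              # units still to be drawn by PPS
        total = sum(remaining.values())   # size total of units still eligible
        # Units whose expected selection probability would exceed 1.
        over = [u for u, s in remaining.items() if k * s / total > 1]
        if not over:
            break
        for u in over:
            certain.append(u)
            del remaining[u]
    return certain, sorted(remaining), n - len(certain)


# Hypothetical stratum: "A" is flagged on the first pass, "B" only on the second.
result = split_certainty_units(
    {"A": 600, "B": 250, "C": 100, "D": 30, "E": 20}, 3
)
```

In this made-up example, A and B end up as certainty selections and one PSU is left to draw by PPS from C, D, and E. In SAS, the same loop could presumably be written as a data step or macro that produces the reduced input list and the reduced SAMPSIZE value before calling PROC SURVEYSELECT.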
It's too bad SELECTALL doesn't override that issue.