I am drawing a random sample (a) with PROC SURVEYSELECT, and now
I want to set aside 10% of that sample as a control group (b) and keep those control records out of the final data. Do I need to draw (b) with a second PROC SURVEYSELECT, merge (b) with (a), and then remove the matches from the final data? Or is there a way to do this in a single, simple step
and avoid heavy programming?
If anyone can help, please do.
Thank you in advance.
There is a more "elegant" way to handle this kind of problem, I think. The trick is to "unsort" the original dataset. Of course, I wouldn't recommend doing that if your original dataset is very large, because sorting is involved. What is nice is that you simply add a flag to your original population indicating where the "record goes".
Here is an example. You'll have exactly 100 observations in the control group and 400 in the rest of the sample; the remaining records won't be selected.
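data T01_population; do i=1 to 10000; output; end; run;  /* T01_population: assumed name for a dummy population */
proc sql;
  create table T02_population_unsorted as
  select * from T01_population order by ranuni(0);       /* shuffle the rows */
quit;
data T03_flagged; set T02_population_unsorted; select;   /* T03_flagged: assumed name */
    when(_N_ le 100) group='CG';                         /* first 100 rows: control group */
    when(_N_ le 500) group='RS';                         /* next 400 rows: rest of sample */
    otherwise group='';                                  /* not selected */
end; run;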
Another possibility: use a probability to decide "where the record goes". This is even faster because you don't need to sort the dataset, but the group sizes come out approximate rather than exact.
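u=ranuni(0); select;   /* draw once per record; calling ranuni(0) in each WHEN would test a different number each time */
when(u le 0.1) group='CG';
when(0.1 lt u le 0.2) group='RS';
otherwise group=''; end;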
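Once every record carries the flag, keeping the control group out of the final data is a single DATA step. This is only a minimal sketch: FLAGGED is a placeholder name for whichever dataset holds the GROUP variable, and CONTROL_GROUP and REST_OF_SAMPLE are made-up output names.

```sas
/* Assumption: FLAGGED (placeholder name) carries the GROUP flag
   with values 'CG', 'RS', or blank for records not selected. */
data control_group rest_of_sample;
  set flagged;
  if group = 'CG' then output control_group;
  else if group = 'RS' then output rest_of_sample;
  /* records with a blank flag are written to neither dataset */
run;
```

Because every record is routed by explicit OUTPUT statements, a control record cannot end up in the final data.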
There are many other possibilities that would apply if the population is very large, for example, or if you want to extend this to stratified sampling, and so on. Let me know if you need further help.
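If you would rather stay with PROC SURVEYSELECT, as in the original question, one of those other possibilities is the OUTALL option, which keeps every input record and adds a 0/1 variable named Selected. The sketch below assumes the sample already drawn (a) sits in a dataset called SAMPLE_A; that name, the seed, and the output dataset names are placeholders of mine, not something from the original post.

```sas
/* Assumption: SAMPLE_A (placeholder name) is the sample (a) drawn earlier. */
proc surveyselect data=sample_a out=sample_flagged
                  method=srs       /* simple random sampling without replacement */
                  samprate=0.10    /* 10% goes to the control group */
                  seed=20240101    /* arbitrary seed, fixed for reproducibility */
                  outall;          /* keep all records, add Selected = 0/1 */
run;

/* Split on the Selected flag so control records never reach the final data. */
data control final;
  set sample_flagged;
  if Selected = 1 then output control;
  else output final;
run;
```

For stratified sampling, the same call should take a STRATA statement, with the input sorted by the stratum variables first, but I have not tried that on your data.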