Posted 10-29-2014 04:43 PM
I have a small dataset (~2000 obs) with ~20 vars, of which var X is really crucial. X takes two values (a and b) in the ratio a:b = 6:4. I want to split this dataset into two samples (1800 obs and 200 obs) such that each sample keeps the same a:b ratio (6:4) in X. How should I do that?

Separate the original dataset into two, one for x=a and the other for x=b.

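From the x=a data set, sample 1080 (60% of 1800) and 120 (60% of 200), without overlap. Do the same for the x=b data set with 40%: 720 (40% of 1800) and 80 (40% of 200).

Join the two larger subsets (1080 + 720) to get the final file of 1800. Ditto for the 200 (120 + 80).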
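A minimal sketch of these three steps, under some assumptions: the input is a data set named WORK.HAVE, X is a character variable with the values 'a' and 'b', and the seeds are arbitrary. All data set names here are made up for illustration. It uses PROC SURVEYSELECT only to draw the simple random samples; the rows it does not select become the larger sample, so that file has exactly 1800 obs only if the input has exactly 1200 a's and 800 b's.

```sas
/* Assumed names: WORK.HAVE is the ~2000-obs input; X is character ('a' or 'b') */

/* Step 1: separate the data by the value of x */
data xa xb;
  set have;
  if x = 'a' then output xa;
  else if x = 'b' then output xb;
run;

/* Step 2: flag 120 of the a's and 80 of the b's for the 200-obs sample.
   OUTALL keeps every input row and adds a 0/1 variable named Selected. */
proc surveyselect data=xa out=xa_flag method=srs sampsize=120
                  seed=2014 outall;
run;

proc surveyselect data=xb out=xb_flag method=srs sampsize=80
                  seed=2015 outall;
run;

/* Step 3: join the pieces; the unselected rows form the larger sample */
data sample200 sample1800;
  set xa_flag xb_flag;
  if Selected = 1 then output sample200;
  else output sample1800;
  drop Selected;
run;
```

Because each row lands in exactly one output data set, the two samples cannot overlap.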

When I sample without replacement, I will often add a variable of random numbers to the data, generated with ranuni (or one of the other random number generators in SAS), sort on that variable, and then just use a DATA step to pull the samples sequentially. You should also be able to do this with PROC SURVEYSELECT, but I like the simplicity in interpretation of the DATA step.
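Here is a rough sketch of that DATA step approach applied to the 6:4 split asked about, not a definitive implementation. It assumes the input is WORK.HAVE and X is a character variable with the values 'a' and 'b'; the data set names, the helper variables _u and _n, and the seed are all made up for illustration.

```sas
/* Assumed names: WORK.HAVE is the input; X is character ('a' or 'b') */

/* Add a uniform(0,1) random number to every row; the seed is arbitrary */
data keyed;
  set have;
  _u = ranuni(12345);
run;

/* Sorting by the random number within x puts each group in random order */
proc sort data=keyed;
  by x _u;
run;

/* Count rows within each value of x and pull the samples sequentially:
   the first 120 a's and the first 80 b's go to the 200-obs sample,
   and everything else goes to the larger sample */
data sample200 sample1800;
  set keyed;
  by x;
  if first.x then _n = 0;
  _n + 1;   /* sum statement: _n is retained across rows */
  if (x = 'a' and _n <= 120) or (x = 'b' and _n <= 80) then output sample200;
  else output sample1800;
  drop _u _n;
run;
```

The larger file comes out at exactly 1800 obs (1080 a's and 720 b's) only when the input holds exactly 1200 a's and 800 b's; with "~2000 obs" the counts will be close to that.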

Can you please give an example of the output datasets?

For more details on Doc's advice, see the article "Sample without replacement in SAS" on The DO Loop blog.

Thanks a lot Doc@Duke and Rick. It is really helpful.

