Sampling a dataset without replacement


10-29-2014 04:43 PM

I have a small dataset (~2000 obs) with ~20 variables, of which the variable X is crucial. X takes two values (a and b) in the ratio a:b = 6:4. I want to create two samples of this dataset (1800 obs and 200 obs) such that each sample keeps the same a:b ratio in X (6:4). How should I do that?

Accepted Solutions


Posted in reply to RJ99

10-30-2014 09:33 AM

Separate the original dataset into two, one for x=a and the other for x=b.

From the x=a dataset, sample 1080 obs (60% of 1800) and 120 obs (60% of 200). Do the same for the x=b dataset: 720 obs (40% of 1800) and 80 obs (40% of 200).

Join the two larger subsets (1080 + 720) to get the final file of 1800. Ditto for 200 (120 + 80).

When I sample without replacement, I will often add a variable generated with ranuni (or one of the other random number generators in SAS) to the data, sort on that variable, and then use a DATA step to pull the samples sequentially. You should also be able to do this with PROC SURVEYSELECT, but I like the simplicity in interpretation of the DATA step.
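
The random-key approach described above could be sketched roughly as follows. This is only an illustration under assumptions not stated in the thread: the input data set is called WORK.HAVE, X is a character variable with the literal values 'a' and 'b', and the seed 12345 is arbitrary. Sorting by x and then by the random key handles both strata in one pass instead of physically splitting the data first.

```sas
/* Assumed names: WORK.HAVE is the input; X holds 'a' or 'b'. */
data keyed;
    set have;
    u = ranuni(12345);          /* uniform(0,1) random sort key; seed is arbitrary */
run;

/* Within each value of x, rows are now in random order. */
proc sort data=keyed;
    by x u;
run;

data sample200 sample1800;
    set keyed;
    by x;
    if first.x then n = 0;      /* restart the counter in each stratum */
    n + 1;                      /* sum statement: n is retained across rows */

    /* First 120 'a' rows and first 80 'b' rows form the 200-obs sample. */
    if (x = 'a' and n <= 120) or (x = 'b' and n <= 80) then output sample200;
    /* Next 1080 'a' rows and next 720 'b' rows form the 1800-obs sample. */
    else if (x = 'a' and n <= 1200) or (x = 'b' and n <= 800) then output sample1800;

    drop u n;
run;
```

Because every row is written to at most one output data set, the two samples do not overlap. If the data set holds fewer than 1200 'a' rows or 800 'b' rows, the larger sample simply comes up short, so the counts are worth checking against the actual frequencies of X.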

All Replies


Posted in reply to RJ99

10-30-2014 04:55 AM

Can you please give an example of the output datasets?


Posted in reply to RJ99

10-30-2014 10:18 AM

For more details on Doc's advice, see the article "Sample without replacement in SAS" on The DO Loop blog.
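
For comparison, the PROC SURVEYSELECT route that Doc mentions might look something like the sketch below. The data set name WORK.HAVE, the values 'a' and 'b', and the seed are assumptions for illustration, not details from the thread. The STRATA statement needs the input sorted by x, and the per-stratum sizes in SAMPSIZE= are listed in that sorted order.

```sas
/* Assumed names: WORK.HAVE is the input; X holds 'a' or 'b'. */
proc sort data=have out=have_sorted;
    by x;
run;

proc surveyselect data=have_sorted out=flagged
        method=srs              /* simple random sampling without replacement */
        sampsize=(120 80)       /* stratum sizes in sorted order: 'a', then 'b' */
        seed=12345              /* arbitrary seed, for reproducibility */
        outall;                 /* keep all rows and add a 0/1 Selected flag */
    strata x;
run;

/* Selected rows form the small sample; the rest form the large one. */
data sample200 sample1800;
    set flagged;
    if Selected = 1 then output sample200;
    else output sample1800;
run;
```

With exactly 1200 'a' rows and 800 'b' rows, the unselected remainder is the 1800-obs sample at the same 6:4 ratio. If the counts are only approximate, a sampling rate (for example `samprate=0.1` in place of `sampsize=`) may preserve the ratio more closely than fixed sizes.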


Posted in reply to RJ99

10-30-2014 02:38 PM

Thanks a lot Doc@Duke and Rick. It is really helpful.