Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- Re: Sampling a dataset without replacement.

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 10-29-2014 04:43 PM
(1017 views)

I have a small dataset ( ~2000 obs) and ~20 vars of which the var X is really crucial. Var x has values ( a and b ) such that a:b =6:4. I want to create two samples ( 1800 Obs and 200 obs) of this dataset such that in final datasets it keeps the ratio of a 😛 in var x same (6:4). How shall i do that.

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Separate the original dataset into two, one for x=a and the other for x=b.

From x=a data set sample 1080 (60% of 1800) and 120 (60% of 200). Do analogously for x=b data set.

Join the two subset to get the final file of 1800. Ditto for 200.

When I do a sample without replacement, I will often add a variable to the data generated with ranuni (or one of the other random number generators in SAS), sort on that variable and then just do a DATA step to sequentially pull the samples. You should also be able to do this with PROC SURVEYSELECT, but I like the simplicity in interpretation of the DATA step.

4 REPLIES 4

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Can you please give some example of output datasets?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Separate the original dataset into two, one for x=a and the other for x=b.

From x=a data set sample 1080 (60% of 1800) and 120 (60% of 200). Do analogously for x=b data set.

Join the two subset to get the final file of 1800. Ditto for 200.

When I do a sample without replacement, I will often add a variable to the data generated with ranuni (or one of the other random number generators in SAS), sort on that variable and then just do a DATA step to sequentially pull the samples. You should also be able to do this with PROC SURVEYSELECT, but I like the simplicity in interpretation of the DATA step.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

For more details on Doc's advice, see the article Sample without replacement in SAS - The DO Loop

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks a lot Doc@Duke and Rick. It is really helpful.

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.