Stratified random sample by group with count conditions (and possibly weighting)

Occasional Contributor
Posts: 6

Hello,

I have a complicated sampling strategy that I am trying to program in SAS 9.4 and am hoping someone can help me. 

My dataset has ~150,000 patients at 1,230 facilities, each of whom received one of two treatments: ~4,000 received treatment A and ~146,000 received treatment B.  I need to generate a dataset with no more than 15 patients selected from each facility, while keeping as many of the 4,000 A patients as I can.  For example, if a facility had 5 A patients, I would keep all 5 and then draw a random sample of 10 B patients from that facility to get n=15.  If a facility has no A patients, I would take a random sample of 15 B patients.  Likewise, if a facility has exactly 15 A patients, I would keep all 15 of them and take no B patients.

There are a few facilities where there are more than 15 A patients, so I would need to draw a random sample of 15 A patients for these facilities.  I would then lose a few A patients from my overall sample.
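
A minimal sketch of the rule described above, assuming the data sit in a dataset called HAVE with variables FACILITY and TREATMENT coded 'A'/'B'. These names, the toy data, and the seed are illustrative assumptions and are not taken from the attached file. The idea is to give every patient a random key, sort so that A patients come before B patients within each facility, and keep the first 15 rows per facility.

```sas
/* Toy data standing in for the real file (all names are assumptions). */
/* Facility 1: 5 A + 35 B, facility 2: 0 A + 40 B, facility 3: 20 A + 20 B */
data have;
    do facility = 1 to 3;
        do patient = 1 to 40;
            if patient <= choosen(facility, 5, 0, 20) then treatment = 'A';
            else treatment = 'B';
            output;
        end;
    end;
run;

/* Attach a uniform random key to every patient (seed chosen arbitrarily) */
data keyed;
    set have;
    if _n_ = 1 then call streaminit(20240101);
    u = rand('uniform');
run;

/* Within a facility: A rows first ('A' sorts before 'B'), random order within treatment */
proc sort data=keyed;
    by facility treatment u;
run;

/* Keep at most the first 15 rows of each facility */
data sample;
    set keyed;
    by facility;
    if first.facility then n_taken = 0;
    n_taken + 1;              /* sum statement: n_taken is retained across rows */
    if n_taken <= 15;         /* subsetting IF drops row 16 onward */
    drop u n_taken;
run;
```

On the toy data this should keep 5 A + 10 B for facility 1, 15 B for facility 2, and a random 15 of the 20 A patients for facility 3. Facilities with fewer than 15 patients in total simply keep everyone. The sketch does not produce sampling weights.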

Because the number of patients A+B from each facility ranges from 0 to >1000, I don't know if I can use PPS or weighting in this algorithm or if I will need to adjust for sampling weights in my analysis later.

I have attached a small sample dataset.

Any help would be greatly appreciated!

Thanks,

Laura

SAS Super FREQ
Posts: 3,755

Re: Stratified random sample by group with count conditions (and possibly weighting)

Posted in reply to hendrixl114

Have you already attempted to use SURVEYSELECT for this sampling, or do you not know about that procedure?

The doc includes an example of PPS stratified sampling.

Occasional Contributor
Posts: 6

Re: Stratified random sample by group with count conditions (and possibly weighting)

I have used SURVEYSELECT, but I need a way to keep all of the A patients at each facility before sampling the B patients.  If I remove the A patients from the dataset first, I then need to select only (15 minus the number of A patients) B patients from each facility.  I think I need some sort of macro that will cycle through the observations at each facility and apply the conditions, but I don't know how to do this.  I'm actually less worried about the PPS than about getting the correct number of A and B patients per facility.
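
One possible way to get facility-specific sample sizes without a macro is to pass SURVEYSELECT a data set of per-stratum sizes through SAMPSIZE=. The sketch below is hedged: it assumes a dataset HAVE with variables FACILITY and TREATMENT ('A'/'B'), and all dataset names, the toy data, and the seed are hypothetical. As far as I know, the SAMPSIZE= data set needs the STRATA variables plus a size variable named _NSIZE_, and the STATS option adds selection probabilities and sampling weights to the output.

```sas
/* Toy data (names are assumptions): facility 1 has 5 A, facility 2 has 0 A, facility 3 has 20 A */
data have;
    do facility = 1 to 3;
        do patient = 1 to 40;
            if patient <= choosen(facility, 5, 0, 20) then treatment = 'A';
            else treatment = 'B';
            output;
        end;
    end;
run;

/* Count available patients per facility-by-treatment stratum */
proc sql;
    create table strata as
    select facility, treatment, count(*) as n_avail
    from have
    group by facility, treatment
    order by facility, treatment;
quit;

/* Allocate up to 15 slots per facility, A first because 'A' sorts before 'B' */
data alloc;
    set strata;
    by facility;
    retain room;
    if first.facility then room = 15;
    _nsize_ = min(n_avail, room);   /* never exceeds the stratum size */
    room = room - _nsize_;
    if _nsize_ > 0;                 /* drop strata that get no slots */
    keep facility treatment _nsize_;
run;

/* Sampling frame: only strata that received an allocation, sorted by strata */
proc sql;
    create table frame as
    select h.*
    from have as h inner join alloc as a
      on h.facility = a.facility and h.treatment = a.treatment
    order by h.facility, h.treatment;
quit;

/* Simple random sample within each stratum, using the per-stratum sizes */
proc surveyselect data=frame method=srs sampsize=alloc
                  stats seed=20240101 out=sample;
    strata facility treatment;
run;
```

The OUT= data set should then carry SelectionProb and SamplingWeight for each selected patient, which could be used later in the SURVEY analysis procedures. Note that B patients at facilities with 15 or more A patients have no chance of selection under this design, so the weights cannot represent them.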

Super User
Posts: 10,044

Re: Stratified random sample by group with count conditions (and possibly weighting)

Posted in reply to hendrixl114

Post some sample data here as text, not as an Excel file. No one wants to download a file from a website.

And don't forget to post your output as well.
