hendrixl114
Calcite | Level 5

Hello,

I have a complicated sampling strategy that I am trying to program in SAS 9.4 and am hoping someone can help me. 

 

My dataset of ~150,000 patients covers 1,230 facilities, and each patient received one of two treatments: ~4,000 received treatment A and ~146,000 received treatment B. I need to generate a dataset with no more than 15 patients selected from each facility, while keeping as many of the 4,000 A patients as I can. For example, if a facility had 5 A patients, I would keep all 5 and then draw a random sample of 10 B patients from that facility to get n=15. If there are no A patients at the facility, I would take a random sample of 15 B patients; likewise, if there are exactly 15 A patients at the facility, I would keep all 15 of them and no B patients.

A few facilities have more than 15 A patients, so for these I would need to draw a random sample of 15 A patients. That means I would lose a few A patients from my overall sample.
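The selection rule described above can be sketched without any per-facility loop. This is a minimal sketch, not a definitive implementation: it assumes one row per patient and uses hypothetical variable names `facility`, `treatment` ('A' or 'B') and `patient_id`, with toy data made up for illustration because the attached dataset is not shown here. Every patient gets a random number, rows are sorted so that A patients come first within each facility, and the first 15 rows per facility are kept.

```sas
/* Toy data (made up): facility 1 has 5 A, facility 2 has 0 A, */
/* facility 3 has 20 A; every facility has 40 B patients.      */
data patients;
  length treatment $1;
  do facility = 1 to 3;
    n_a = choosen(facility, 5, 0, 20);
    do i = 1 to n_a + 40;
      patient_id + 1;
      treatment = ifc(i <= n_a, 'A', 'B');
      output;
    end;
  end;
  drop i n_a;
run;

/* Attach a uniform random number to every patient. */
data ranked;
  set patients;
  if _n_ = 1 then call streaminit(12345);  /* fixed seed for a reproducible draw */
  is_a = (treatment = 'A');                /* 1 for A patients, 0 for B */
  u = rand('uniform');
run;

/* Within each facility: A patients first, random order inside each group. */
proc sort data=ranked;
  by facility descending is_a u;
run;

/* Keep the first 15 rows per facility. */
data sample;
  set ranked;
  by facility;
  if first.facility then pick_order = 0;
  pick_order + 1;
  if pick_order <= 15;
  drop u is_a;
run;

/* Check the A/B counts per facility. */
proc freq data=sample;
  tables facility*treatment / norow nocol nopercent;
run;
```

With the toy data, facility 1 should yield 5 A plus 10 B patients, facility 2 should yield 15 B patients, and facility 3 should yield a random 15 of its 20 A patients. Because the order inside each treatment group is set by a uniform random number, taking the first rows amounts to a simple random sample without replacement within that group.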

 

Because the number of patients (A+B) per facility ranges from 0 to more than 1,000, I don't know whether I can use PPS (probability proportional to size) sampling or weighting in this algorithm, or whether I will need to adjust for sampling weights in my analysis later.
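On the weighting question, one way to look at this design is as stratified sampling, with each facility-by-treatment group as a stratum and every facility included. Under that reading, a sampled patient's weight would be the stratum population count divided by the stratum sample count. The sketch below only computes those ratios; whether the analysis actually needs them depends on the estimand. The variable names `facility` and `treatment` and the toy data are assumptions made up for illustration.

```sas
/* Toy data (made up): facility 1 has 5 A and 40 B, */
/* facility 2 has 20 A and 30 B.                    */
data patients;
  length treatment $1;
  input facility treatment $ count;
  do i = 1 to count;
    output;
  end;
  drop i count;
  datalines;
1 A 5
1 B 40
2 A 20
2 B 30
;
run;

/* Population counts of A and B patients per facility. */
proc sql;
  create table counts as
  select facility,
         sum(treatment = 'A') as n_a_pop,
         sum(treatment = 'B') as n_b_pop
  from patients
  group by facility;
quit;

/* Sample counts implied by the "A first, at most 15" rule, */
/* and the weights N/n for each treatment group.            */
data weights;
  set counts;
  n_a_samp = min(n_a_pop, 15);
  n_b_samp = min(n_b_pop, 15 - n_a_samp);
  if n_a_samp > 0 then weight_a = n_a_pop / n_a_samp;
  if n_b_samp > 0 then weight_b = n_b_pop / n_b_samp;
run;

proc print data=weights noobs;
run;
```

In the toy data, facility 1 keeps all 5 A patients (weight 1) and 10 of 40 B patients (weight 4). Facility 2 keeps 15 of 20 A patients (weight 20/15) and no B patients, so its B weight is left missing.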

 

I have attached a small sample dataset.

 

Any help would be greatly appreciated!

 

Thanks,

Laura

3 REPLIES
Rick_SAS
SAS Super FREQ

Have you already attempted to use SURVEYSELECT for this sampling, or do you not know about that procedure?

The doc includes an example of PPS stratified sampling.

 

 

hendrixl114
Calcite | Level 5

I have used SURVEYSELECT, but I need a way to take all of the A patients for each facility before sampling the B patients. If I remove the A patients from the dataset first, I need to be able to select only 15 minus the number of A patients from each facility. I think I need some sort of macro that will cycle through the observations at each facility and apply the conditions, but I don't know how to do this. I'm actually less worried about the PPS than about getting the correct number of A and B patients per facility.
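The per-facility B count (15 minus the number of A patients kept) can be handed to PROC SURVEYSELECT as a data set of stratum sample sizes, instead of cycling through facilities with a macro. The sketch below is one possible approach under stated assumptions: one row per patient, hypothetical variable names `facility`, `treatment` and `patient_id`, and made-up toy data. It builds a `SAMPSIZE=` data set with the variable `_NSIZE_` for each facility-by-treatment stratum. Strata whose allocation is zero are dropped from both inputs, since my understanding is that the procedure expects positive stratum sample sizes.

```sas
/* Toy data (made up): facility 1 has 5 A and 40 B, */
/* facility 2 has 40 B only, facility 3 has 20 A and 40 B. */
data patients;
  length treatment $1;
  input facility treatment $ count;
  do i = 1 to count;
    patient_id + 1;
    output;
  end;
  drop i count;
  datalines;
1 A 5
1 B 40
2 B 40
3 A 20
3 B 40
;
run;

/* Population counts of A and B patients per facility. */
proc sql;
  create table counts as
  select facility,
         sum(treatment = 'A') as n_a_pop,
         sum(treatment = 'B') as n_b_pop
  from patients
  group by facility
  order by facility;
quit;

/* One row per stratum with its sample size: A patients up to 15, */
/* then B patients fill whatever is left of the 15.               */
data alloc;
  set counts;
  length treatment $1;
  treatment = 'A';
  _NSIZE_ = min(n_a_pop, 15);
  if _NSIZE_ > 0 then output;
  treatment = 'B';
  _NSIZE_ = min(n_b_pop, 15 - min(n_a_pop, 15));
  if _NSIZE_ > 0 then output;
  keep facility treatment _NSIZE_;
run;

/* Keep only patients in strata that have a positive allocation. */
proc sort data=patients out=sorted;
  by facility treatment;
run;

data eligible;
  merge sorted (in=in_pop)
        alloc  (in=in_alloc keep=facility treatment);
  by facility treatment;
  if in_pop and in_alloc;
run;

/* Simple random sample within each facility-by-treatment stratum. */
proc surveyselect data=eligible out=sample method=srs
                  sampsize=alloc seed=12345;
  strata facility treatment;
run;

proc freq data=sample;
  tables facility*treatment / norow nocol nopercent;
run;
```

Because each `_NSIZE_` is capped at the stratum's population count, no stratum asks for more patients than it has. Facilities with 15 or fewer A patients keep all of them, since the A stratum's sample size then equals its population count.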

 

Ksharp
Super User

Post some data here, not as an Excel file. No one wants to download a file from a website.

And don't forget to post your output too.
