Programming the statistical procedures from SAS

How to create random samples from dataset?

Occasional Contributor
Posts: 12

How to create random samples from dataset?

Hi,

I am trying to perform oversampling on my dataset (~200,000 observations), which contains a flag variable with value 1 or 0. I want 100 samples, each containing all the observations with flag=1 (~100 in total) plus ~750 randomly selected observations with flag=0. However, I am having difficulty getting what I want: I end up with ~10,000 observations in each sample, and sometimes my code takes forever to run. Can someone advise me on what is wrong?

My code is as follows:

data oversamples;
    do sample=1 to 100;
        set fun;
        do i = 1 to _N_;
            if flag= 1 or (flag=0 and ranuni(sample+7320) < 0.003) then output;
        end;
    end;
run;

Thanks for any advice.

Frequent Contributor
Posts: 94

Re: How to create random samples from dataset?

I think your set fun statement should be before the do loop.
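
One way to read this suggestion, as a minimal sketch rather than a tested fix: read each observation once with SET, then loop over the 100 sample IDs and decide per sample whether to keep the row. The 0.00375 rate is an assumption (about 750 out of roughly 200,000 flag=0 observations), as are the seed value and the use of RAND with CALL STREAMINIT in place of RANUNI.

```sas
data oversamples;
    /* Seed the random stream once; the seed value is arbitrary */
    if _n_ = 1 then call streaminit(7320);

    /* Read one observation per DATA step iteration */
    set fun;

    do sample = 1 to 100;
        /* Keep every flag=1 row; keep other rows with probability 0.00375 */
        if flag = 1 or rand('uniform') < 0.00375 then output;
    end;
run;

/* Group the rows by sample for later BY-group processing */
proc sort data=oversamples;
    by sample;
run;
```

With this approach the number of flag=0 rows in each sample is random (about 750 on average), not exact. In the original code, every execution of SET inside the loop reads a new observation, and the inner loop repeats the test on that same row up to _N_ times, which is consistent with the inflated counts and long run times.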

Contributor
Posts: 37

Re: How to create random samples from dataset?

Why don't you try PROC SURVEYSELECT? It picks random samples.

proc surveyselect data=data_set
   method=srs n=100 out=data_random;
run;

Occasional Contributor
Posts: 12

Re: How to create random samples from dataset?

Hi akberali,

My output is now completely random. I want it to include all elements with flag=1 and then randomly select ~800 elements with flag=0. Is it possible to do that with PROC SURVEYSELECT?

Thanks for the advice

SAS Super FREQ
Posts: 3,547

Re: How to create random samples from dataset?

Use the STRATA statement and set the sampling rate for the "rare event category" to 100%. See http://www.nesug.org/proceedings/nesug07/sa/sa02.pdf, beginning at the bottom of page 2.
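
A minimal sketch of this stratified approach, assuming the input dataset is named fun as in the question and that flag takes only the values 0 and 1. The output dataset names, the seed, and the 0.00375 rate (about 750 out of roughly 200,000 flag=0 observations) are assumptions, not values taken from the linked paper.

```sas
/* STRATA requires the input to be sorted by the stratum variable */
proc sort data=fun out=fun_sorted;
    by flag;
run;

/* SAMPRATE= values are listed in stratum sort order:         */
/* flag=0 is sampled at 0.375%, flag=1 at 100% (the value 1   */
/* is treated as 100%). REPS=100 draws 100 independent        */
/* samples, identified by the Replicate variable in OUT=.     */
proc surveyselect data=fun_sorted out=oversamples
        method=srs
        samprate=(0.00375 1)
        reps=100
        seed=7320;
    strata flag;
run;
```

Each replicate then contains every flag=1 observation plus a simple random sample of the flag=0 observations, and the output can be processed per sample with BY Replicate.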
