Pooja2
Fluorite | Level 6

Hi All,

I have a data set with one patient per row, along with how many times that patient has seen a doctor in the last 6 months.

Date        Patient   visit
28Jul2018   AA        2
27Nov2018   BB        1
19Aug2018   BB        4
28Jul2018   AA        2
27Nov2018   GG        4
19Aug2018   CC        2

 

I have to randomly select patients until the total visit count reaches 10.

 

Please help.

 

Thank you!

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Does this work for you?

 

data have;
input Date:date9. Patient $ visit;
format Date date9.;
datalines;
28Jul2018 AA 2
27Nov2018 BB 1
19Aug2018 BB 4
28Jul2018 AA 2
27Nov2018 GG 4
19Aug2018 CC 2
28Jul2018 AA 2
27Nov2018 BB 1
19Aug2018 BB 4
28Jul2018 AA 2
27Nov2018 GG 4
19Aug2018 CC 2
;

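/* Add a running row counter c to use as a unique hash key. */
data temp;
    set have;
    c+1;
run;

data want(keep=Date Patient Visit);
    /* Never executed at run time: defines the variables and stores the row count in nobs. */
    if 0 then set temp nobs=nobs;

    /* Load every row of temp into a hash object keyed by the row counter. */
    declare hash h(dataset:'temp');
    h.definekey('c');
    h.definedata(all:'Y');
    h.definedone();

    sum=0;
    /* Draw random rows until the visits add up to exactly 10.
       Note: the loop does not end if no remaining row can complete the total. */
    do until (sum = 10);
        pick=ceil(rand('Uniform')*nobs);
        /* Accept the row only if it is still in the hash and does not overshoot 10. */
        if (h.find(key:pick)=0) & ((10-sum) ge visit) then do;
            sum+visit;
            output;
            rc=h.remove(key:pick);   /* sample without replacement */
        end;
    end;
    stop;   /* single pass: no SET statement reads data at run time */
run;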


10 REPLIES
PeterClemmensen
Tourmaline | Level 20

Should the total reach exactly 10, or 10 or above?

Pooja2
Fluorite | Level 6

Exactly 10.

PeterClemmensen
Tourmaline | Level 20

And do you want to do this with PROC SURVEYSELECT, or can it be in a DATA step?

Pooja2
Fluorite | Level 6

Yes, it can be a DATA step. It doesn't have to be PROC SURVEYSELECT. Thanks!

Pooja2
Fluorite | Level 6

@PeterClemmensen thank you so much for your help!!

 

Thanks!

PeterClemmensen
Tourmaline | Level 20

Anytime, glad to help 🙂

Pooja2
Fluorite | Level 6

One more question: I want to set a seed so that I get the same result every time, in case I need to replicate my results in the future. How can I do that?

PeterClemmensen
Tourmaline | Level 20

No problem. Simply put

call streaminit(123);

in the DATA step, before the first call to RAND.
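
A minimal sketch of the placement, separate from the solution code: the seed is set once, ahead of any RAND call, so rerunning the step should reproduce the same picks. The row count of 12 and the three draws are only illustrative assumptions.

```sas
data _null_;
    call streaminit(123);                    /* set the seed before the first RAND call */
    do i = 1 to 3;
        pick = ceil(rand('Uniform') * 12);   /* 12 = assumed number of rows */
        put pick=;                           /* write each pick to the log */
    end;
run;
```

In the solution's data want step, the same call would go right after the "if 0 then set" line.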

Pooja2
Fluorite | Level 6

@PeterClemmensen Thanks for helping. One more question: let's say the data set has only 2 or 3 as values of the visit variable. At some point the sum reaches 9, and since there is no 1 left to pick to make it 10 (it has to be either 2 or 3), the step goes into an endless loop. Can you suggest how to fix that?
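
This question has no reply in the thread. One possible way around it, offered only as a sketch and not as the accepted solution: hold the picks in memory, cap the number of draws per attempt, and start over when an attempt gets stuck, writing rows out only once the total is exactly 10. It assumes the temp data set from the accepted solution and visit values of 1 or more. The macro variables target and maxtries, the limit of 20*nobs draws, and the names picked, try, draw, total, n and dup are all made up for illustration. Restarting may also change the selection probabilities somewhat compared with the original loop.

```sas
%let target   = 10;    /* required total number of visits */
%let maxtries = 1000;  /* assumed cap on restarts, not from the thread */

data want(keep=Date Patient visit);
    if 0 then set temp nobs=nobs;      /* defines variables, stores row count */
    call streaminit(123);              /* fixed seed so reruns give the same picks */

    declare hash h(dataset:'temp');    /* lookup table keyed by row counter c */
    h.definekey('c');
    h.definedata(all:'Y');
    h.definedone();

    /* With visit >= 1, at most &target rows can be picked in one attempt. */
    array picked[&target] _temporary_;

    total = 0;
    do try = 1 to &maxtries until (total = &target);
        total = 0;                     /* restart: discard the previous attempt */
        n = 0;
        do draw = 1 to 20 * nobs until (total = &target);
            pick = ceil(rand('Uniform') * nobs);

            /* Skip rows already chosen in this attempt. */
            dup = 0;
            do i = 1 to n;
                if picked[i] = pick then dup = 1;
            end;

            if dup = 0 then do;
                rc = h.find(key:pick);             /* loads visit for this row */
                if rc = 0 and visit >= 1 and visit <= &target - total then do;
                    n = n + 1;
                    picked[n] = pick;
                    total = total + visit;
                end;
            end;
        end;
    end;

    if total = &target then do;
        /* Success: write the chosen rows to the output data set. */
        do i = 1 to n;
            rc = h.find(key:picked[i]);
            output;
        end;
    end;
    else put "WARNING: no selection reached exactly &target visits in &maxtries attempts.";
    stop;
run;
```

If no subset of rows can ever add up to the target, the step ends with the warning and an empty want data set instead of looping forever.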
