DATA Step, Macro, Functions and more

Ranuni function and imported Excel file with character data

Reply
New Contributor
Posts: 3

Ranuni function and imported Excel file with character data

[ Edited ]


You are a perinatal and reproductive epidemiologist and are given an Excel spreadsheet, Project3.xlsx, which contains 1000 records. There are a total of 5 hospitals in the dataset and in each of these hospitals, there were three possible types of infant outcomes: normal (>= 2500 grams), low birth weight (<2500 grams but >= 1500 grams), and very low birth weight (< 1500 grams).

 

You are interested in conducting a sub-study in which you would like to randomly select 3 very low birth weight, 10 low birth weight, and 25 normal infants from each hospital.

 

Trusted Advisor
Posts: 1,022

Re: Ranuni function and imported Excel file with character data

Posted in reply to Gpepsicola

Will I get course credit for doing this homework?

New Contributor
Posts: 3

Re: Ranuni function and imported Excel file with character data

Posted in reply to Gpepsicola
No. I'm looking for suggestions to complete it myself. Thanks for inquiring.
Trusted Advisor
Posts: 1,022

Re: Ranuni function and imported Excel file with character data

Posted in reply to Gpepsicola

OK, here's the randomizing part, untested.  It assumes dataset HAVE is sorted by hospital.  I know that RANUNI is deprecated but you can substitute SAS's improved uniform random number generator.

 

 

 

data want (drop=av_: w_: w ninfants);

  array avail {3} av_vlbw av_lbw av_nw;
  do w=1 to 3; avail{w}=0; end;

  array want  {3} w_vlbw w_lbw w_nw;
  w_vlbw=3;
  w_lbw=10; 
  w_nw=25;

  /* Get total available infants by weight group */
  do ninfants=1 by 1 until (last.hospital);
    set have;
    by hospital;
    w=1+(weight>=1500)+(weight>=2500);
    avail{w}=avail{w}+1;
  end;

  /* re-read and randomly draw */
  do until (last.hospital);
    set have;
    by hospital;
    w=1+(weight>=1500)+(weight>=2500);

    if want{w}/avail{w}>=ranuni(09184065) then do;
      output;
      want{w}=want{w}-1;
    end;
    avail{w}=avail{w}-1;
  end;
run;  
	  
New Contributor
Posts: 3

Re: Ranuni function and imported Excel file with character data

Thanks.

Super User
Posts: 19,832

Re: Ranuni function and imported Excel file with character data

Posted in reply to Gpepsicola

 

IMO you should really look into PROC SURVEYSELECT.

Super User
Posts: 10,041

Re: Ranuni function and imported Excel file with character data

Posted in reply to Gpepsicola

proc import datafile='/folders/myfolders/project3.xlsx' out=have dbms=xlsx replace;
run;
proc sort data=have;by hospnm bthwt;run;
proc surveyselect data=have sampsize=25 selectall out=temp;
strata hospnm bthwt;
run;
data want;
do until(last.bthwt);
 set temp;
 by hospnm bthwt;
end;
_bthwt=bthwt;

do i=1 by 1 until(last.bthwt);
 set temp;
 by hospnm bthwt;
 if _bthwt='VLBW' then do;
  if i le 3 then output;
 end;
 else  if _bthwt='LBW' then do;
        if i le 10 then output;
       end;
   else output;
end;
keep hospnm bthwt subjno;
run;
proc print noobs;run;
Super User
Posts: 10,041

Re: Ranuni function and imported Excel file with character data

Posted in reply to Gpepsicola

proc import datafile='/folders/myfolders/project3.xlsx' out=have dbms=xlsx replace;
run;
proc sort data=have;by hospnm bthwt;run;
proc surveyselect data=have sampsize=25 selectall out=temp;
strata hospnm bthwt;
run;
data want;
 set temp;
 by hospnm bthwt;
 if first.bthwt then i=0;
 i+1;
 if bthwt='VLBW' then do;
  if i le 3 then output;
 end;
 else  if bthwt='LBW' then do;
        if i le 10 then output;
       end;
   else output;

keep hospnm bthwt subjno;
run;
proc print noobs;run;


Trusted Advisor
Posts: 1,022

Re: Ranuni function and imported Excel file with character data

@Ksharp

 

I have a concern about your proposed solution:

 

You're asking SURVEYSELECT for a sample size of 25 for each of 3 birthweight group, so that later you can take the smaller sample sizes of 3 and 10 from 2 of those groups.  Let's say the hospital has 5 VLBW births (or any number between 4 and 25), in which case surveyselect would keep all the VLBW births, presumably in the original order.  Then you take the first 3.

 

Frequently the original data might be sorted within hospital, say by date of birth. Then your suggestion risks never drawing VLBW births at the end of the time period.  I think you either have to pre-sort the data in random order within hospital (in which case surveyselect is not needed at all), or run surveyselect 3 times, with 3 different WHERE statements and 3 sampsize values.

 

regards,

Mark

 

 

Super User
Posts: 10,041

Re: Ranuni function and imported Excel file with character data

Mike,
You are right. I never thought proc surveyselect would not change the order of obs 
when sample size is less than 25. Bad feature for proc surveyselect .

This could be simple as yours :



proc import datafile='/folders/myfolders/project3.xlsx' out=have dbms=xlsx replace;
run;
data have;
 set have;
 _rand=ranuni(0);
run;
proc sort data=have;by hospnm bthwt _rand;run;
data want;
 set have;
 by hospnm bthwt;
 if first.bthwt then i=0;
 i+1;
 if bthwt='VLBW' then do;
  if i le 3 then output;
 end;
 else  if bthwt='LBW' then do;
        if i le 10 then output;
       end;
   else output;

keep hospnm bthwt subjno;
run;
proc print noobs;run;

Super User
Posts: 10,041

Re: Ranuni function and imported Excel file with character data

Opps,
This ought to be right.



proc import datafile='/folders/myfolders/project3.xlsx' out=have dbms=xlsx replace;
run;
data have;
 set have;
 _rand=ranuni(0);
run;
proc sort data=have;by hospnm bthwt _rand;run;
data want;
 set have;
 by hospnm bthwt;
 if first.bthwt then i=0;
 i+1;
 if bthwt='VLBW' then do;
  if i le 3 then output;
 end;
 else  if bthwt='LBW' then do;
        if i le 10 then output;
       end;
   else do;
        if i le 25 then output;
   end;
keep hospnm bthwt subjno;
run;
proc print noobs;run;

Trusted Advisor
Posts: 1,022

Re: Ranuni function and imported Excel file with character data

@Ksharp

 

Nice program.  I like the idea of sorting by hospnm bthwt.  But I think further simplification is achieveable:

 

data have (drop=imax i rand);

  retain imax;

  set have;

  by hospnm bthwt;

  if first.bthwt then do;

    if bthwt='VLBW' then imax=3; else

    if bthwt='LBW' then imax=10; else

    if bthwt='Normal' then imax=25;

    i=0;

  end;

  i+1;

  if i<=imax;

run;

Trusted Advisor
Posts: 1,022

Re: Ranuni function and imported Excel file with character data

[ Edited ]
Posted in reply to Gpepsicola

My comment to @Ksharp prompted a thought of a simpler approach.  It has more steps, but doesn't require counting the number of available births, and preserves equal sampling probability for each record within a hospital/birthweight group.

 

Now editted to use orginal var names and values:

And edditted again - in my first edit I used strikeout font for variable hospital, which I see is not preserved in the final version.  Those are gone now.

 

data need / view=need;
  set have;
  rnum=ranuni(01672356);
run;
proc sort data=need out=sorted;
  by hospm  rnum;
run;
data want (drop=rnum w);
  set sorted;
  by hospnm ;
  array need {3} _temporary_ ;
  if first.hospnm then do;
    need{1}=3; need{2}=10; need{3}=25;
  end;
w=findw('VLBW LBW Normal',trim(bthwt),' ','e'); if need{w}>0; need{w}=need{w}-1; run;

 

Ask a Question
Discussion stats
  • 12 replies
  • 355 views
  • 1 like
  • 4 in conversation