Hi -
I'm doing a conditional all-to-all merge in SQL with the code below. It works; but seems silly to have to create the table twice. I'm looking for a way to select randomly from the rows of the joined table before writing it out.
proc sql;
create table inftrans as select
a.patientid as patientid, a.hospsize as hospsize, b.patientid as inf_source, 1 as newinf,
a.hospid as whereinf, a.daysin as dayinf
from work.patients (where=(infected=0)) as a, work.patients (where=(infected=1)) as b
where a.hospid=b.hospid & a.patientid^=b.patientid;
create table infrtrans as select * from inftrans
where ranbin(0,1,10/(hospsize**2))=1;
quit;
When I try adding the condition to the first where statement like this:
proc sql;
create table inftrans as select
a.patientid as patientid, a.hospsize as hospsize, b.patientid as inf_source, 1 as newinf,
a.hospid as whereinf, a.daysin as dayinf
from work.patients (where=(infected=0)) as a, work.patients (where=(infected=1)) as b
where a.hospid=b.hospid & a.patientid^=b.patientid & ranbin(0,1,10/(hospsize**2))=1;
quit;
inftrans will apply the ranbin() condition to a single row from the input dataset & thus be true multiple times.
So the question is if there is a cleaner way to get a random subset of the dyad-level dataset?
Thank you!
Jim
Hi @jmoody77 and welcome to the SAS Support Communities!
You could apply the condition involving RANBIN via a WHERE= dataset option in the first CREATE TABLE statement:
create table infrtrans(where=(ranbin(...)=1)) as ...
(Personally, I would prefer a positive random seed value to obtain a reproducible dataset, but that's your decision.)
Another option would be to create a view (CREATE VIEW statement) and to perform the random selection based on the view (be it with PROC SQL as you did or with PROC SURVEYSELECT if SAS/STAT is available or in a DATA step).
Thank you!!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.