Hi -
I'm doing a conditional all-to-all merge in SQL with the code below. It works; but seems silly to have to create the table twice. I'm looking for a way to select randomly from the rows of the joined table before writing it out.
proc sql;
create table inftrans as select
a.patientid as patientid, a.hospsize as hospsize, b.patientid as inf_source, 1 as newinf,
a.hospid as whereinf, a.daysin as dayinf
from work.patients (where=(infected=0)) as a, work.patients (where=(infected=1)) as b
where a.hospid=b.hospid & a.patientid^=b.patientid;
create table infrtrans as select * from inftrans
where ranbin(0,1,10/(hospsize**2))=1;
quit;
When I try adding the condition to the first where statement like this:
proc sql;
create table inftrans as select
a.patientid as patientid, a.hospsize as hospsize, b.patientid as inf_source, 1 as newinf,
a.hospid as whereinf, a.daysin as dayinf
from work.patients (where=(infected=0)) as a, work.patients (where=(infected=1)) as b
where a.hospid=b.hospid & a.patientid^=b.patientid & ranbin(0,1,10/(hospsize**2))=1;
quit;
inftrans will apply the ranbin() condition to a single row from the input dataset & thus be true multiple times.
So the question is if there is a cleaner way to get a random subset of the dyad-level dataset?
Thank you!
Jim
Hi @jmoody77 and welcome to the SAS Support Communities!
You could apply the condition involving RANBIN via a WHERE= dataset option in the first CREATE TABLE statement:
create table infrtrans(where=(ranbin(...)=1)) as ...
(Personally, I would prefer a positive random seed value to obtain a reproducible dataset, but that's your decision.)
Another option would be to create a view (CREATE VIEW statement) and to perform the random selection based on the view (be it with PROC SQL as you did or with PROC SURVEYSELECT if SAS/STAT is available or in a DATA step).
Thank you!!
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.