I have a problem like this. I have two tables. The first table contains only one variable, patient ID. The second has two variables, patient ID and another variable, say, patient age. The second dataset is very large. Now I would like to subset the second dataset to include ONLY patient IDs that are in the first table.
I tried to write something like the code below, but it didn't work. WITHOUT USING merge, is it possible to do this in a way similar to what I wrote? Thank you.
proc sql;
create table want as
select *
from table b
where b.patientID in a.patientID;
quit;
Your code is close. Try something like:
data small;
input patientID;
cards;
1
2
3
;
data large;
input patientID age;
cards;
1 10
2 11
3 12
4 13
5 14
6 15
7 16
;
proc sql;
create table want as
select b.*
from small a, large b
where b.patientID eq a.patientID
;
quit;
Art, CEO, AnalystFinder.com
This should work:
proc sql;
create table want as
select a.*
from bigtable a
inner join smalltable b
on a.patientID = b.patientID;
quit;
or
proc sql;
create table want as
select *
from bigtable
where patientID in (select patientID from smalltable);
quit;
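For what it's worth, the original attempt likely failed because the IN operator expects a parenthesized list or subquery, and table a is never named in the FROM clause. If the large table is big enough that the SQL versions are slow, a DATA step hash lookup is another way to do this without merge. The following is only a minimal sketch, assuming the same dataset names (bigtable, smalltable) and key (patientID) used above, and assuming the small table fits in memory:

```sas
/* Sketch only: keep rows of bigtable whose patientID appears in smalltable. */
/* Assumes smalltable fits in memory; dataset names are taken from the reply above. */
data want;
  if _n_ = 1 then do;
    /* Load the small table's IDs into a hash object once */
    declare hash ids(dataset: 'smalltable');
    ids.defineKey('patientID');
    ids.defineDone();
  end;
  set bigtable;
  /* check() returns 0 when the current patientID is found in the hash */
  if ids.check() = 0;
run;
```

Neither table has to be sorted for this, and the large table is read only once.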