SeanZ
Obsidian | Level 7

I have two tables. The first contains a single variable, patient ID. The second has two variables, patient ID and one other variable (say, patient age), and it is very large. I would like to subset the second table so that it includes ONLY the patient IDs that appear in the first table.

 

I tried to write something like the code below, but it didn't work. WITHOUT using merge, is it possible to do this in a way similar to what I wrote? Thank you.

 

proc sql;
  create table want as
    select *
      from table b
        where b.patientID in a.patientID;
quit;

2 REPLIES
art297
Opal | Level 21

Your code is close. Try something like:

 

data small;
  input patientID;
  cards;
1
2
3
;
data large;
  input patientID age;
  cards;
1 10
2 11
3 12
4 13
5 14
6 15
7 16
;

proc sql;
  create table want as
    select b.*
      from small a,large b
        where b.patientID eq a.patientID
  ;
quit;
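
One caveat worth noting as an aside: a join like the one above returns one row per matching pair, so if small ever held the same patientID more than once, the matching rows of large would be repeated in want. The sample data has no duplicates, so this is only a sketch under the assumption that duplicates in small are possible. It collapses small to distinct IDs in an inline view before joining:

```sas
proc sql;
  create table want as
    select b.*
      /* inline view: one row per patientID, so each row of large matches at most once */
      from (select distinct patientID from small) as a,
           large as b
        where b.patientID eq a.patientID
  ;
quit;
```

With the sample data above, this should give the same three rows as the plain join.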

Art, CEO, AnalystFinder.com

kiranv_
Rhodochrosite | Level 12

This should work:

proc sql;
  create table want as
    select a.*
      from bigtable a
        inner join smalltable b
          on a.patientID = b.patientID;
quit;

 

Or, using a subquery:

 

proc sql;
  create table want as
    select *
      from bigtable
        where patientID in (select patientID from smalltable);
quit;
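
Since the question mentions that the second table is very large and asks to avoid merge, here is one more possibility that nobody in this thread proposed: a DATA step hash lookup. It is a different technique from the PROC SQL answers, offered only as a minimal sketch. It assumes the tables are named small and large as in the sample data earlier in the thread, and that the list of IDs fits in memory. Neither table needs to be sorted.

```sas
data want;
  /* On the first iteration only, load the IDs from small into an in-memory hash table */
  if _n_ = 1 then do;
    declare hash ids(dataset: 'small');
    ids.defineKey('patientID');
    ids.defineDone();
  end;

  /* Read large one row at a time, in its existing order */
  set large;

  /* check() returns 0 when the current patientID is found in the hash; keep only those rows */
  if ids.check() = 0;
run;
```

Whether this beats the SQL versions depends on the data sizes and the environment, so it is worth timing both on the real tables.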
