knveraraju91
Barite | Level 11

I have two large SDTM datasets. I need to find the number of SUBJIDs that are present in one dataset but not in the other, and vice versa. Is there any code that I can use?

Thanks

4 REPLIES
ChrisNZ
Tourmaline | Level 20

If you can sort or index your tables by SUBJID, you can then do this:

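data WANT;
  /* IN= flags record which table contributed to the current BY group */
  merge TAB1(keep=SUBJID in=A)
        TAB2(keep=SUBJID in=B);
  by SUBJID;
  /* Keep one row per SUBJID that is found in only one of the two tables */
  if first.SUBJID and not(A and B);
  if A then SOURCE='TAB1'; else SOURCE='TAB2';
run;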
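
A BY-group merge like the one above needs each table to be sorted or indexed by SUBJID. As a rough sketch, assuming both tables live in the WORK library (an assumption, adjust the libref as needed) and have no index on SUBJID yet, simple indexes could be created like this:

```sas
/* Assumption: TAB1 and TAB2 are in WORK and have no existing index on SUBJID */
proc datasets library=WORK nolist;
  modify TAB1;
    index create SUBJID;   /* simple index named SUBJID */
  modify TAB2;
    index create SUBJID;
quit;
```

Running PROC SORT with BY SUBJID on each table would satisfy the merge as well. An index avoids rewriting the data, though reading through an index can be slower than reading sorted data.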


 
PGStats
Opal | Level 21

Or use SQL:

proc sql;
create table want as
select "TAB1" as source, subjid from TAB1 where subjid not in (select subjid from TAB2)
union all
select "TAB2" as source, subjid from TAB2 where subjid not in (select subjid from TAB1);
quit;
PG
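
To get from either WANT table to the counts the question asks for, one possible follow-up is to count distinct SUBJIDs per source, since a subject may appear on several rows of an SDTM dataset. This is only a sketch: it assumes WANT has the SOURCE and SUBJID columns created above, and the column name n_subjid is made up for illustration.

```sas
proc sql;
  /* COUNT(DISTINCT ...) ignores missing values and repeated SUBJIDs */
  select source, count(distinct subjid) as n_subjid
  from want
  group by source;
quit;
```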
FreelanceReinh
Jade | Level 19

If you're only interested in how many distinct, non-missing SUBJIDs there are in the first dataset and not in the second and vice versa:

proc sql;
select count(subjid) as only_in_tab1
from ((select subjid from tab1)
      except
      (select subjid from tab2));

select count(subjid) as only_in_tab2
from ((select subjid from tab2)
      except
      (select subjid from tab1));
quit;
knveraraju91
Barite | Level 11

Thank you all
