Hello,
I have two large SDTM datasets. I need to find the number of SUBJID values that are present in one dataset but not in the other, and vice versa. Is there any code that I can use?
Thanks.
If your tables are sorted or indexed by SUBJID, you can do this:
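data WANT;
  /* IN= flags record which table contributes to the current SUBJID */
  merge TAB1(keep=SUBJID in=A)
        TAB2(keep=SUBJID in=B);
  by SUBJID;
  /* Keep one row per SUBJID that is missing from either table */
  if first.SUBJID and not(A and B);
  if A then SOURCE='TAB1'; else SOURCE='TAB2';
run;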
Or use SQL:
proc sql;
  /* DISTINCT keeps one row per subject even when a table has several records per SUBJID */
  create table want as
  select distinct "TAB1" as source, subjid from TAB1 where subjid not in (select subjid from TAB2)
  union all
  select distinct "TAB2" as source, subjid from TAB2 where subjid not in (select subjid from TAB1);
quit;
If you're only interested in how many distinct, non-missing SUBJIDs there are in the first dataset and not in the second and vice versa:
proc sql;
  select count(subjid) as only_in_tab1
    from ((select subjid from tab1)
          except
          (select subjid from tab2));
  select count(subjid) as only_in_tab2
    from ((select subjid from tab2)
          except
          (select subjid from tab1));
quit;
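To try the counting approach before running it on the real data, here is a minimal sketch on made-up data. The subject IDs and the macro variable names (only_in_tab1, only_in_tab2) are assumptions for illustration only; TAB1 and TAB2 stand in for your two SDTM datasets.

```sas
/* Hypothetical toy data standing in for the two SDTM datasets */
data tab1;
  input subjid $;
  datalines;
S001
S001
S002
S003
;
run;

data tab2;
  input subjid $;
  datalines;
S002
S003
S003
S004
S005
;
run;

proc sql noprint;
  /* EXCEPT drops duplicate rows, so each subject is counted once */
  select count(subjid) into :only_in_tab1 trimmed
    from (select subjid from tab1
          except
          select subjid from tab2);
  select count(subjid) into :only_in_tab2 trimmed
    from (select subjid from tab2
          except
          select subjid from tab1);
quit;

/* Write both counts to the log */
%put NOTE: only in TAB1=&only_in_tab1, only in TAB2=&only_in_tab2;
```

With this toy data the log should show 1 for TAB1 (S001) and 2 for TAB2 (S004 and S005). Storing the counts in macro variables is just one option; drop NOPRINT and the INTO clauses if you only want the counts printed.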
Thank you all