SASPhile
Quartz | Level 8
Dataset A has a million IDs and dataset B has a million IDs.
How can I find the count of distinct IDs that are common to the two datasets?
3 REPLIES
sbb
Lapis Lazuli | Level 10
PROC SQL - two SELECT DISTINCT queries on your key variables (keyvar1, keyvar2), one for each file, and then a JOIN, possibly using a sub-query in the process.

For a DATA step approach, I suggest creating a VIEW for each file, then running PROC SORT NODUPKEY on each with your BY variable list, then doing a MERGE with a BY statement. With the IN= dataset option on each input, you can test the IN= variables to find the BY values to which both files contribute.
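
A minimal sketch of the DATA step steps described above, not a definitive implementation. It assumes two datasets named work.haveA and work.haveB with a single key variable id; those names, along with a_ids, b_ids, inA, inB, last, and n_common, are placeholders chosen for illustration. To keep it short, it uses a KEEP= dataset option in place of a VIEW for each file.

```sas
/* Assumed inputs: work.haveA and work.haveB, each with a key variable id. */

/* Reduce each file to one row per id. */
proc sort data=work.haveA(keep=id) out=work.a_ids nodupkey;
  by id;
run;

proc sort data=work.haveB(keep=id) out=work.b_ids nodupkey;
  by id;
run;

/* Merge by id and count the ids to which both files contribute. */
data _null_;
  merge work.a_ids(in=inA) work.b_ids(in=inB) end=last;
  by id;
  if inA and inB then n_common + 1;   /* sum statement: starts at 0 and is retained */
  if last then put 'Distinct ids common to both files: ' n_common;
run;
```

The count is written to the SAS log. Because both inputs are deduplicated before the MERGE, each id is counted at most once.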

Scott Barry
SBBWorks, Inc.
Patrick
Opal | Level 21
Hi

A SQL approach:

data haveA;
do id=1,2,2,3,4,5,5,5,6;
output;
end;
run;

data haveB;
do id=1,1,1,3,3,4,6;
output;
end;
run;

proc sql feedback;
  /* Deduplicate each table first, then count the ids present in both */
  select count(*) as N_UniqueIds
  from
    (select distinct id from work.haveA) as A,
    (select distinct id from work.haveB) as B
  where A.id = B.id;
quit;


HTH
Patrick
SASPhile
Quartz | Level 8
Thanks Patrick.
