I have two data sets with the same variables, but different observations. I need to know if any observations in data set 1 are in data set 2. How do I do this? Do I merge them first?
A few details would be helpful.
Could data set 1 contain two identical observations? How would you like to handle that?
Do you need to identify observations that are 100% identical, or just largely identical?
PROC COMPARE is one option.
Another option is to put your full observation in one varible - and convert it to a hash, using MD5 or SHA.
Based on that you can use either data step merge or SQL inner join.
data have1; set sashelp.class; run; data have2; set sashelp.class end=last; output; if last then do;name='xxxx';output;end; run; proc sql; create table obs_in_both as select * from have1 intersect select * from have2 ; quit;
This will output all observations in B that match any observation in A, which satisfies your criterion as long as neither dataset has duplicates, and A and B have the same variables.
data a b;
set sashelp.class;
if mod(_n_,3)=0 then output a b;
else if mod(_n_,3)=1 then output a;
else output b;
run;
data both;
set b;
if _n_=1 then do;
declare hash ha (dataset:'a');
ha.definekey(all:'Y');
ha.definedone();
end;
if ha.find()=0;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.