Hi everyone,
I have two data sets with an identifier variable. I would like to know which ID variables are not in the base set: So my data is this:
DATA cars1;
INPUT make $ group ID;
CARDS;
AMC 1 123
AMC 1 123
AMC 1 123
Buick 1 456
Buick 1 456
;
RUN;
DATA cars2;
INPUT make $ group ID;
CARDS;
AMC 1 123
AMC 1 123
AMC 1 123
Buick 1 456
Buick 1 456
Toyota 2 789
Honda 1 780
;
RUN;
I would like to have proc compare tell me that ID 780 and 789 are not in Cars 1.
Additionally, would I be able to use a where statement to restrict the comparison to only group 1? For example, I want to know which ID is not in the cars2 dataset restricted on only data that is part of group 1. The result would be 780.
Thanks in advance!
PROC COMPARE is very efficient to check are two tables identical and if negative
which variables in which observations are not equal, displaying theeir values.
To do what you want is very easy by sql:
proc sq;
create table diff as /* <<< optional >>> */
select distinct ID from table_2
where ID not in (select distinct ID from table_1);
quit;
use following code instead of PROC COMPARE.
Proc sort data=CARS1; by ID; Run;
Proc sort data=CARS2; by ID; Run;
data Difference;
merge CARS1(in=A) CARS2(in=B);
by ID MAKE Group;
length inds $ 40;
if A and not B then do;
inds='CARS1';
output;
end;
if not A and B then do;
inds='CARS2';
output;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.