04-19-2018 10:51 AM
I have two data sets with an identifier variable. I would like to know which ID variables are not in the base set: So my data is this:
DATA cars1; INPUT make $ group ID; CARDS; AMC 1 123 AMC 1 123 AMC 1 123 Buick 1 456 Buick 1 456 ; RUN; DATA cars2; INPUT make $ group ID; CARDS; AMC 1 123 AMC 1 123 AMC 1 123 Buick 1 456 Buick 1 456 Toyota 2 789 Honda 1 780 ; RUN;
I would like to have proc compare tell me that ID 780 and 789 are not in Cars 1.
Additionally, would I be able to use a where statement to restrict the comparison to only group 1? For example, I want to know which ID is not in the cars2 dataset restricted on only data that is part of group 1. The result would be 780.
Thanks in advance!
04-19-2018 10:59 AM
PROC COMPARE is very efficient to check are two tables identical and if negative
which variables in which observations are not equal, displaying theeir values.
To do what you want is very easy by sql:
proc sq; create table diff as /* <<< optional >>> */ select distinct ID from table_2 where ID not in (select distinct ID from table_1); quit;
04-19-2018 12:18 PM
use following code instead of PROC COMPARE.
Proc sort data=CARS1; by ID; Run; Proc sort data=CARS2; by ID; Run; data Difference; merge CARS1(in=A) CARS2(in=B); by ID MAKE Group; length inds $ 40; if A and not B then do; inds='CARS1'; output; end; if not A and B then do; inds='CARS2'; output; end; run;