Hello Everyone,
I have two datasets. One dataset is the subset of another dataset. I want to match both of the IDs . I want to find IDs from the subset dataset that wont match with the master dataset. For example:
Master dataset:
ID |
1 |
2 |
3 |
4 |
5 |
Subset dataset
ID |
2 |
4 |
5 |
9 |
When we match both datasets, My result should look like this:
9 |
Use the EXCEPT operator in SQL:
proc sql;
select id from Subset
except
select id from Master;
quit;
or the NOT IN operator
proc sql;
select id from Subset
where id not in (select id from Master);
quit;
data a;
input ID;
datalines;
1
2
3
4
5
;
data b;
input ID;
datalines;
2
4
5
9
;
proc sql;
create table want as
select ID
from b
where ID not in (select ID from a);
quit;
Use the EXCEPT operator in SQL:
proc sql;
select id from Subset
except
select id from Master;
quit;
or the NOT IN operator
proc sql;
select id from Subset
where id not in (select id from Master);
quit;
A data step will be the fastest since the data is sorted.
data WANT;
merge MASTER(in=M) SUBSET(in=S);
by ID;
if S and not M;
run;
if you have duplicates and want to avoid them, the if clause becomes :
if first.ID and S and not M;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.