🔒 This topic is solved and locked.
Need further help from the community? Please
sign in and ask a new question.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Posted 08-25-2016 05:39 PM
(2427 views)
Hello Everyone,
I have two datasets. One dataset is the subset of another dataset. I want to match both of the IDs . I want to find IDs from the subset dataset that wont match with the master dataset. For example:
Master dataset:
ID |
1 |
2 |
3 |
4 |
5 |
Subset dataset
ID |
2 |
4 |
5 |
9 |
When we match both datasets, My result should look like this:
9 |
1 ACCEPTED SOLUTION
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use the EXCEPT operator in SQL:
proc sql;
select id from Subset
except
select id from Master;
quit;
or the NOT IN operator
proc sql;
select id from Subset
where id not in (select id from Master);
quit;
PG
3 REPLIES 3
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data a;
input ID;
datalines;
1
2
3
4
5
;
data b;
input ID;
datalines;
2
4
5
9
;
proc sql;
create table want as
select ID
from b
where ID not in (select ID from a);
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use the EXCEPT operator in SQL:
proc sql;
select id from Subset
except
select id from Master;
quit;
or the NOT IN operator
proc sql;
select id from Subset
where id not in (select id from Master);
quit;
PG
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A data step will be the fastest since the data is sorted.
data WANT;
merge MASTER(in=M) SUBSET(in=S);
by ID;
if S and not M;
run;
if you have duplicates and want to avoid them, the if clause becomes :
if first.ID and S and not M;