DATA Step, Macro, Functions and more

Compare two data sets and keep only the records not in both

Accepted Solution Solved
Reply
Regular Contributor
Posts: 216
Accepted Solution

Compare two data sets and keep only the records not in both

Hi

I have two data sets, one of which is a sub-set of the other. What I would like to do is to create a third data set that contains the records that are not common to both data sets. Does anyone know of a clean way to do this?

Paul


Accepted Solutions
Solution
‎04-23-2013 03:06 PM
Regular Contributor
Posts: 216

Re: Compare two data sets and keep only the records not in both

I found an answer actually.

Paul

View solution in original post


All Replies
Respected Advisor
Posts: 4,646

Re: Compare two data sets and keep only the records not in both

If dataset B is a subset of dataset A then

proc sql;

create table A_minus_B  as

select * from A

EXCEPT

select * from B;

quit;

selects rows from A which do not appear in B

PG

PG
Regular Contributor
Posts: 216

Re: Compare two data sets and keep only the records not in both

This statement would not work because it does not give an indicator of a column to match on? Even though the records are a sub-set, the columns are slightly different between the two.

Paul

Respected Advisor
Posts: 4,646

Re: Compare two data sets and keep only the records not in both

Please look up documentation for SQL set operators in

SAS(R) 9.3 SQL Procedure User's Guide

PG

PG
Regular Contributor
Posts: 198

Re: Compare two data sets and keep only the records not in both

This macro provides one way to do that

http://www.sascommunity.org/wiki/Macro_Extract

Regular Contributor
Posts: 198

Re: Compare two data sets and keep only the records not in both

see also

Using the EXCEPT Operator in PROC SQL

and Generation Data Sets to Produce a Comparison Report

Stanley Fogleman,

http://www.nesug.org/proceedings/nesug06/cc/cc10.pdf

Solution
‎04-23-2013 03:06 PM
Regular Contributor
Posts: 216

Re: Compare two data sets and keep only the records not in both

I found an answer actually.

Paul

Respected Advisor
Posts: 4,646

Re: Compare two data sets and keep only the records not in both

I would like to know what it is. Could you please share? - PG

PG
Respected Advisor
Posts: 3,777

Re: Compare two data sets and keep only the records not in both

Is it this last example from the page you cited.

Producing Rows from the First Query or the Second Query

First or second table, but not both
There is no keyword                  in PROC SQL that returns unique rows from the first and second table,                  but not rows that occur in both.  Here is one way that you can simulate                  this operation:              
(query1 except query2) union (query2 except query1)
This example shows how                  to use this operation.              
proc sql; title 'A EXCLUSIVE UNION B'; (select * from sql.a except select * from sql.b) union (select * from sql.b except select * from sql.a);
Producing Rows from the First Query or the Second Query
A EXCLUSIVE UNION B
The first EXCEPT returns                  one unique row from the first table (table A) only. The second EXCEPT                  returns one unique row from the second table (table B) only. The middle                  UNION combines the two results. Thus, this query returns the row from                  the first table that is not in the second table, as well as the row                  from the second table that is not in the first table.              
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 379 views
  • 4 likes
  • 4 in conversation