## Compare two data sets and keep only the records not in both

Solved
Regular Contributor
Posts: 216

# Compare two data sets and keep only the records not in both

Hi

I have two data sets, one of which is a sub-set of the other. What I would like to do is to create a third data set that contains the records that are not common to both data sets. Does anyone know of a clean way to do this?

Paul

Accepted Solutions
Solution
‎04-23-2013 03:06 PM
Regular Contributor
Posts: 216

Paul

All Replies
Posts: 5,543

## Re: Compare two data sets and keep only the records not in both

If dataset B is a subset of dataset A then

proc sql;

create table A_minus_B  as

select * from A

EXCEPT

select * from B;

quit;

selects rows from A which do not appear in B

PG

PG
Regular Contributor
Posts: 216

## Re: Compare two data sets and keep only the records not in both

This statement would not work because it does not give an indicator of a column to match on? Even though the records are a sub-set, the columns are slightly different between the two.

Paul

Posts: 5,543

## Re: Compare two data sets and keep only the records not in both

Please look up documentation for SQL set operators in

SAS(R) 9.3 SQL Procedure User's Guide

PG

PG
Regular Contributor
Posts: 227

## Re: Compare two data sets and keep only the records not in both

This macro provides one way to do that

http://www.sascommunity.org/wiki/Macro_Extract

Regular Contributor
Posts: 227

## Re: Compare two data sets and keep only the records not in both

Using the EXCEPT Operator in PROC SQL

and Generation Data Sets to Produce a Comparison Report

Stanley Fogleman,

http://www.nesug.org/proceedings/nesug06/cc/cc10.pdf

Solution
‎04-23-2013 03:06 PM
Regular Contributor
Posts: 216

Paul

Posts: 5,543

## Re: Compare two data sets and keep only the records not in both

I would like to know what it is. Could you please share? - PG

PG
Posts: 3,852

## Re: Compare two data sets and keep only the records not in both

Is it this last example from the page you cited.

## Producing Rows from the First Query or the Second Query

There is no keyword                  in PROC SQL that returns unique rows from the first and second table,                  but not rows that occur in both.  Here is one way that you can simulate                  this operation:
`(query1 except query2) union (query2 except query1)`
This example shows how                  to use this operation.
`proc sql; title 'A EXCLUSIVE UNION B'; (select * from sql.a except select * from sql.b) union (select * from sql.b except select * from sql.a);`
Producing Rows from the First Query or the Second Query
The first EXCEPT returns                  one unique row from the first table (table A) only. The second EXCEPT                  returns one unique row from the second table (table B) only. The middle                  UNION combines the two results. Thus, this query returns the row from                  the first table that is not in the second table, as well as the row                  from the second table that is not in the first table.
🔒 This topic is solved and locked.