Solved
Contributor
Posts: 20

# Comparing datasets and returning identical observations

I'm looking to compare two datasets that have just two variables each(Name & Num). Rather than return a listing of the differences between the datasets, what I'm after is a listing of identical obervations in the two datasets.

``````DATA ONE;
INPUT NAME NUM \$ ;
CARDS;
FRED   1234
WILMA  4567
BARNEY 8369
BAMBAM 7894
;
RUN;

DATA TWO;
INPUT NAME NUM \$ ;
CARDS;
FRED   1234
WILMA  8369
BETTY  7894
;
RUN;

PROC SORT DATA=ONE NODUPKEY;
BY NAME NUM;
RUN;

PROC SORT DATA=TWO NODUPKEY;
BY NAME NUM;
RUN;

DATA FINAL;
PROC COMPARE BASE=ONE COMPARE=TWO;
RUN;
OUTPUT:                              ``````

I can't seem to find anything that will allow me to report on identical oberservations.

Any suggestions?

Thanks

Accepted Solutions
Solution
‎08-22-2016 12:48 PM
Super User
Posts: 13,542

## Re: Comparing datasets and returning identical observations

First you'll likely get better results if Name is character.

Try this;

``````DATA ONE;
INPUT NAME \$ NUM \$ ;
CARDS;
FRED   1234
WILMA  4567
BARNEY 8369
BAMBAM 7894
;
RUN;

DATA TWO;
INPUT NAME \$ NUM \$ ;
CARDS;
FRED   1234
WILMA  8369
BETTY  7894
;
RUN;

PROC SORT DATA=ONE NODUPKEY;
BY NAME NUM;
RUN;

PROC SORT DATA=TWO NODUPKEY;
BY NAME NUM;
RUN;

data three;
merge one (in=inone)
two (in=intwo);
by name num;
if inone and intwo;
run; ``````

All Replies
Posts: 3,167

## Re: Comparing datasets and returning identical observations

Try DUPOUT= option in Proc Sort,

``````data _stack;
set one two;
run;

proc sort data=_stack dupout=_wanted_dup nodupkey;
by _all_;
run;``````
Solution
‎08-22-2016 12:48 PM
Super User
Posts: 13,542

## Re: Comparing datasets and returning identical observations

First you'll likely get better results if Name is character.

Try this;

``````DATA ONE;
INPUT NAME \$ NUM \$ ;
CARDS;
FRED   1234
WILMA  4567
BARNEY 8369
BAMBAM 7894
;
RUN;

DATA TWO;
INPUT NAME \$ NUM \$ ;
CARDS;
FRED   1234
WILMA  8369
BETTY  7894
;
RUN;

PROC SORT DATA=ONE NODUPKEY;
BY NAME NUM;
RUN;

PROC SORT DATA=TWO NODUPKEY;
BY NAME NUM;
RUN;

data three;
merge one (in=inone)
two (in=intwo);
by name num;
if inone and intwo;
run; ``````
☑ This topic is solved.