Solved: Finding the duplicate values between two datasets

Sandeep77 · Posted 03-24-2023 06:51 AM

I have two SAS datasets, Sep_release and Oct_release. Both contain the variable REFERENCE_NUMBER. I want to find out whether any REFERENCE_NUMBER released in Sep_release also appears in Oct_release, that is, which REFERENCE_NUMBER values occur in both datasets. I am unsure of the best way to do this. I tried PROC SORT and a DATA step, but neither gives the right result. Can you please suggest an approach?

data duplicate_REFERENCE_NUMBER;
 set sep_release oct_release;
 if not (first.REFERENCE_NUMBER) then output;
run;

I also tried this step:

Proc sort data=sep_release
nodupkey dupout=oct_release;
by REFERENCE_NUMBER;
run;

svh · Posted 03-24-2023 07:40 AM

Using an inner join in PROC SQL should work:
proc sql;
create table duplicate_reference_number as
select a.REFERENCE_NUMBER
from sep_release a INNER JOIN oct_release b
on a.REFERENCE_NUMBER = b.REFERENCE_NUMBER;
quit;
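
One caveat, offered as a hedged sketch rather than as part of the accepted answer: if a REFERENCE_NUMBER can occur more than once in either dataset, an inner join returns one row per matching pair, so the same value may be listed several times. Assuming only the list of shared values is needed, adding DISTINCT collapses the repeats. The output table name shared_reference_numbers is made up for illustration.

```sas
proc sql;
  /* DISTINCT keeps one row per REFERENCE_NUMBER found in both tables */
  create table shared_reference_numbers as
  select distinct a.REFERENCE_NUMBER
  from sep_release a inner join oct_release b
    on a.REFERENCE_NUMBER = b.REFERENCE_NUMBER;
quit;
```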

Ksharp · Posted 03-24-2023 08:17 AM

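/* Sample data: A is the full class table; B is the same table without its first row. */
data a;
set sashelp.class;
run;
data b;
set sashelp.class;
if _n_=1 then delete;
run;

/* INTERSECT keeps only the names that occur in both tables and drops repeated rows. */
proc sql;
create table duplicate_name as
select name from a
intersect
select name from b
;
quit;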
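
For a DATA step route closer to the attempts in the question, a minimal sketch follows. It assumes REFERENCE_NUMBER has the same type in both datasets; the sorted copies sep_sorted and oct_sorted and the IN= flags in_sep and in_oct are names invented here. Two points about the original attempts motivate it: FIRST. variables are only set when a BY statement is present, and DUPOUT= names the dataset that receives the discarded duplicates, so the PROC SORT in the question would overwrite oct_release.

```sas
/* Sort de-duplicated copies so the original datasets stay untouched */
proc sort data=sep_release(keep=REFERENCE_NUMBER) out=sep_sorted nodupkey;
  by REFERENCE_NUMBER;
run;

proc sort data=oct_release(keep=REFERENCE_NUMBER) out=oct_sorted nodupkey;
  by REFERENCE_NUMBER;
run;

/* Keep a REFERENCE_NUMBER only when it is present in both inputs */
data duplicate_REFERENCE_NUMBER;
  merge sep_sorted(in=in_sep) oct_sorted(in=in_oct);
  by REFERENCE_NUMBER;
  if in_sep and in_oct;
run;
```

The subsetting IF plays the same role as the inner join or INTERSECT above: a row survives only when both IN= flags are true.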
