BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Sandeep77
Lapis Lazuli | Level 10

I have two SAS datasets. First one is Sep_release and second one is Oct_release. Both the datasets have REFERENCE_NUMBER as common variables. I want to find if the REFERENCE_NUMBER released in sep_release is also coming in Oct_release. Basically trying to find the duplicate REFERENCE_NUMBER in both the datasets. But I am unsure of best way of doing it. I tried proc sort and data step but not getting the result as the code is not right and I am unsure of the right way to approach. Can you please suggest?

data duplicate_REFERENCE_NUMBER;
 set sep_release oct_release;
 if not (first.REFERENCE_NUMBER) then output;
run;

Also, I tried this step

Proc sort data=sep_release
nodupkey dupout=oct_release;
by REFERENCE_NUMBER;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
svh
Lapis Lazuli | Level 10 svh
Lapis Lazuli | Level 10
Using an inner join in PROC SQL should work:
proc sql;
create table duplicate_reference_number as
select a.REFERENCE_NUMBER
from sep_release a INNER JOIN oct_release b
on a.REFERENCE_NUMBER = b.REFERENCE_NUMBER;
quit;

View solution in original post

2 REPLIES 2
svh
Lapis Lazuli | Level 10 svh
Lapis Lazuli | Level 10
Using an inner join in PROC SQL should work:
proc sql;
create table duplicate_reference_number as
select a.REFERENCE_NUMBER
from sep_release a INNER JOIN oct_release b
on a.REFERENCE_NUMBER = b.REFERENCE_NUMBER;
quit;
Ksharp
Super User
data a;
set sashelp.class;
run;
data b;
set sashelp.class;
if _n_=1 then delete;
run;


proc sql;
create table duplicate_name as
select name from a
intersect
select name from b
;
quit;