Hello,
I am using PROC SQL to perform a left join on four common key variables in order to add one new variable (provspec) to the left dataset. The left dataset has 45,370,249 rows and the right dataset has 14,496,317 rows, but the resulting table contains 45,370,500 rows, which is 251 more than the left dataset. I would like to identify which records in the right dataset are causing rows from the left dataset to be duplicated. Could anyone offer suggestions? Below is a copy of my program.
Thanks
proc sql;
    create table medical_1217 as
    select a.*, b.provspec
    from medical_1217_icd10 a
    left join provspec_merge b
        on  a.orsid     = b.orsid
        and a.recordid  = b.recordid
        and a.log       = b.log
        and a.file_type = b.file_type;
quit;
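
For reference, one possible way to locate the offending records is sketched below. A left join returns more rows than the left table only when some left row matches more than one right row, so the idea is to look for key combinations that occur more than once in provspec_merge. The table names dup_keys and dup_records and the column n_rows are hypothetical names chosen for illustration; everything else comes from the program above. This is a sketch, not a tested solution for this data.

```sas
proc sql;
    /* Step 1: key combinations that appear more than once in the right dataset. */
    /* dup_keys and n_rows are illustrative names, not part of the original code. */
    create table dup_keys as
    select b.orsid, b.recordid, b.log, b.file_type,
           count(*) as n_rows
    from provspec_merge b
    group by b.orsid, b.recordid, b.log, b.file_type
    having count(*) > 1;

    /* Step 2: pull the full right-dataset records for those keys so the */
    /* differing provspec values can be inspected side by side.          */
    create table dup_records as
    select b.*
    from provspec_merge b
    inner join dup_keys d
        on  b.orsid     = d.orsid
        and b.recordid  = d.recordid
        and b.log       = d.log
        and b.file_type = d.file_type
    order by b.orsid, b.recordid, b.log, b.file_type;
quit;
```

Not every key in dup_keys necessarily matches a row in medical_1217_icd10, so only the duplicated keys that also exist in the left dataset would account for the 251 extra rows. If the duplicates turn out to carry identical provspec values, de-duplicating provspec_merge on the four keys before the join may be enough; if the values differ, a rule for choosing one value per key would be needed first.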