LEINAARE
Obsidian | Level 7

Hello,

 

I am using PROC SQL to perform a left join on four common variables in order to add one new variable (provspec). The left dataset has 45,370,249 rows. The right dataset has 14,496,317 rows. The new table contains 45,370,500 rows, which is 251 more than the left dataset. I would like to identify which records in the right dataset are causing rows from the left dataset to be duplicated. Could anyone offer suggestions? Below is a copy of my program.

 

Thanks

 

proc sql;
	create table medical_1217 as
	select a.*, b.provspec
	from medical_1217_icd10 a
	left join provspec_merge b
	on a.orsid = b.orsid
	and a.recordid = b.recordid
	and a.log = b.log
	and a.file_type = b.file_type;
quit;

Accepted Solutions
Reeza
Super User
Usually the quickest way is to count rows by your ID groups (the four join variables) in each data set, find a key combination that occurs more than once, and trace it.
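As a rough sketch of that counting approach, assuming the duplication comes from the right table (provspec_merge) holding more than one row for some combination of the four join keys, something like the following could list the offending keys and then pull their full rows. The output table names provspec_dups and provspec_dup_rows and the column n_rows are made up for illustration.

```sas
proc sql;
	/* Key combinations that occur more than once in the right table. */
	/* provspec_dups and n_rows are hypothetical names.               */
	create table provspec_dups as
	select b.orsid, b.recordid, b.log, b.file_type, count(*) as n_rows
	from provspec_merge b
	group by b.orsid, b.recordid, b.log, b.file_type
	having count(*) > 1;

	/* Full right-table rows for those keys, so each group can be traced. */
	create table provspec_dup_rows as
	select b.*
	from provspec_merge b
	inner join provspec_dups d
	on b.orsid = d.orsid
	and b.recordid = d.recordid
	and b.log = d.log
	and b.file_type = d.file_type
	order by b.orsid, b.recordid, b.log, b.file_type;
quit;
```

Only the duplicated keys that also appear in medical_1217_icd10 add rows to the join, so not every group in provspec_dups necessarily contributes to the extra rows. Running the same GROUP BY / HAVING query against the left table would show whether it has repeated keys as well.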



LEINAARE
Obsidian | Level 7

Thank you for your help.
