BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ryanb2
Quartz | Level 8

I'm querying an Oracle database and need to use SQL.  I'm trying to pull records from the oracle database where the year of birth is either 2019 or 2020 AND (match either DOB OR first name OR last name OR mother's last) from records in a SAS dataset.  When I do this I get duplicate records (in the millions) when I'd like to get only one distinct record even when the same record meets the where criteria for multiple records in the SAS dataset.  Any help is appreciated.

proc sql;
	create table OUTPUT as
	select 
		a.ID as CERTIFICATEID label='CERTIFICATEID', a.BIRTH_ID, 
		a.SEALED, a.CREATE_DATE,
		datepart(a.BIRTH_DATE) as B_DOB label='B_DOB' format=mmddyy10., a.CHILD_SEX as B_SEX label='B_SEX', 
		a.CHILD_NAME_FIRST as B_NAME_FIRST label='B_NAME_FIRST', a.CHILD_NAME_MIDDLE as B_NAME_MIDDLE label='B_NAME_MIDDLE', a.CHILD_NAME_LAST as B_NAME_LAST label='B_NAME_LAST', 
		a.PARENT_1_NAME_FIRST as B_NAME_FIRST_FATHER label='B_NAME_FIRST_FATHER', a.PARENT_1_NAME_MIDDLE as B_NAME_MIDDLE_FATHER label='B_NAME_MIDDLE_FATHER', a.PARENT_1_NAME_LAST as B_NAME_LAST_FATHER label='B_NAME_LAST_FATHER', 
		a.PARENT_2_NAME_FIRST as B_NAME_FIRST_MOTHER label='B_NAME_FIRST_MOTHER', a.PARENT_2_NAME_MIDDLE as B_NAME_MIDDLE_MOTHER label='B_NAME_MIDDLE_MOTHER', a.PARENT_2_NAME_LAST as B_NAME_LAST_MOTHER label='B_NAME_LAST_MOTHER'
	from ORACLE.TABLE as a, SAS_DATASET as e
	where 
		year(datepart(a.BIRTH_DATE)) in (2020 2019) /* Do not include if not state registered*/
		AND
		(
		a.CHILD_NAME_FIRST=e.D_NAME_FIRST OR
		a.CHILD_NAME_LAST=e.D_NAME_LAST OR
		put(datepart(a.BIRTH_DATE),mmddyy10.)=e.D_DOB OR
		a.PARENT_2_NAME_FIRST=e.D_NAME_FIRST_MOTHER
		)
	;
quit;
1 ACCEPTED SOLUTION

Accepted Solutions
Ryanb2
Quartz | Level 8

That did it!  Thanks.  I also changed it to a left join but kept the where statement as is.  Much appreciated.

View solution in original post

5 REPLIES 5
Reeza
Super User
The OR condition is likely making it possible to join to the same row multiple times....I suspect you'll need to either modify that join condition somehow or do it in multiple steps to avoid duplicates and matching records that are already matched.
You are also using a cross join which is inefficient.
KathyKiraly
SAS Employee

Try the DISTINCT keyword in your SELECT for unique rows across all columns:

 

  select distinct
a.ID as CERTIFICATEID label='CERTIFICATEID', a.BIRTH_ID,
a.SEALED, a.CREATE_DATE,
...

Ryanb2
Quartz | Level 8

That did it!  Thanks.  I also changed it to a left join but kept the where statement as is.  Much appreciated.

Reeza
Super User
You need to mark the solution as correct, not your response.
Ryanb2
Quartz | Level 8

Oops!  Sorry.  How do correct this?

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1955 views
  • 4 likes
  • 3 in conversation