mlogan
Lapis Lazuli | Level 10

Hi All,

I am looking for code that will let me find duplicates based on exact as well as similar values. I know that for exact values we can use NODUP, NODUPKEYS and NOUNIQUEKEYS. Can anyone tell me which function I can use to identify duplicate records based on similar values?

 

Example: The following two records should be considered duplicates even though the last name and address are not identical (they are only similar).

 

First Name   Last Name   Address
John         Murruy      1 New York St.
John         Murray      1 New York Street

 

Thanks,

2 REPLIES
PeterClemmensen
Tourmaline | Level 20

This question is a bit fuzzy by nature: how similar must two strings be before they are considered equal? In other words, when are two strings in two different observations similar enough to be treated as duplicates and therefore omitted?

 

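Two functions to get you going are COMPLEV and COMPGED. Both take two strings as input and return a number that represents the 'distance' between them: COMPLEV returns the Levenshtein edit distance and COMPGED returns the generalized edit distance. The smaller the number, the more similar the strings, so you still have to choose a cutoff below which two values count as duplicates.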
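
As a rough sketch of how these functions could be applied to the example in the question: the step below compares every pair of records that share a first name and keeps the pairs whose last name and address fall within a distance cutoff. The data set name `people`, the `id` column, and the cutoffs (2 for last name, 5 for address) are assumptions made for illustration only. They are not from the original post and would need tuning on real data.

```sas
/* Sample records from the question. The id column is an added,  */
/* hypothetical row identifier so that pairs can be told apart.  */
data people;
    infile datalines dlm='|' truncover;
    length first_name $20 last_name $20 address $40;
    input id first_name $ last_name $ address $;
    datalines;
1|John|Murruy|1 New York St.
2|John|Murray|1 New York Street
;
run;

/* Self-join: pair each record with every later record that has  */
/* the same first name, then keep pairs within the cutoffs.      */
/* The cutoffs 2 and 5 are illustrative assumptions.             */
proc sql;
    create table candidate_dups as
    select a.id        as id_a,
           b.id        as id_b,
           a.last_name as last_name_a,
           b.last_name as last_name_b,
           a.address   as address_a,
           b.address   as address_b,
           /* Levenshtein edit distance, compared case-insensitively */
           complev(upcase(a.last_name), upcase(b.last_name)) as lev_last,
           complev(upcase(a.address),   upcase(b.address))   as lev_addr,
           /* Generalized edit distance, shown for comparison only */
           compged(upcase(a.address),   upcase(b.address))   as ged_addr
    from people as a, people as b
    where a.id < b.id
      and upcase(a.first_name) = upcase(b.first_name)
      and calculated lev_last <= 2
      and calculated lev_addr <= 5;
quit;
```

Requiring an exact match on first name acts as a simple blocking key, which keeps the number of compared pairs down. Without some restriction like this, the self-join compares every record with every other record and becomes slow on large tables. The rows in `candidate_dups` are only candidate duplicates, so it is worth reviewing them before deleting anything.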

SimonDawson
SAS Employee
If you have a license for the SAS Data Quality procedures, I'd look into match code generation to solve this type of problem.

http://support.sas.com/documentation/cdl/en/dqclref/70016/HTML/default/viewer.htm#n1597gcbsehaokn1j5...
