Research duplicates using DataFlux

Hi, I'm starting to use DataFlux and I need to find duplicates in a table. The documentation suggests using the following nodes in a data job:
- Match code
- Clustering
- Surviving Record Identification
- Entity Resolution File Output
I don't understand how to configure these nodes, or whether this is the right way to find duplicates and then delete them.
Can you help me? Thank you.

Re: Research duplicates using DataFlux

Posted in reply to AlmavivaSAS

Hi,

Have a look at the SAS Global Forum paper "How to Find Your Perfect Match Using SAS® Data Management".

It touches on some of the topics you are asking about. Beyond that, you'll need to work through the online documentation for details on each individual node.
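
To make the idea behind those nodes more concrete, here is a rough sketch in plain Python. It is not DataFlux code and does not use any DataFlux API: the function names (`match_code`, `cluster`, `survivor`, `deduplicate`) and the record fields (`id`, `name`, `city`, `phone`) are all made up for illustration. The "match code" here is just a normalized string, which is far cruder than the fuzzy match codes the product generates, and the survivorship rule (keep the most complete record) is only one possible choice.

```python
import re
from collections import defaultdict


def match_code(name: str, city: str) -> str:
    """Crude stand-in for a match code: lowercase and strip non-alphanumerics."""
    def norm(value: str) -> str:
        return re.sub(r"[^a-z0-9]", "", value.lower())
    return f"{norm(name)}|{norm(city)}"


def cluster(records):
    """Group records that share a match code; returns {cluster_id: [records]}."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_code(rec["name"], rec["city"])].append(rec)
    return {cid: recs for cid, recs in enumerate(groups.values(), start=1)}


def survivor(recs):
    """Keep the record with the most non-empty fields; ties go to the lowest id."""
    def completeness(rec):
        filled = sum(1 for v in rec.values() if v not in (None, ""))
        return (filled, -rec["id"])
    return max(recs, key=completeness)


def deduplicate(records):
    """Return one surviving record per cluster, ordered by id."""
    clusters = cluster(records)
    return sorted((survivor(recs) for recs in clusters.values()),
                  key=lambda rec: rec["id"])
```

In this sketch, two rows such as "John Smith / Rome" and "john  smith. / ROME" get the same match code, land in the same cluster, and only the more complete one survives. The rows that do not survive are the duplicates you would then delete or route to a separate output.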

 

Good luck,

Ahmed