We are a fairly inexperienced team working with DataFlux 2.7 to create a job that will read a large CSV file. The file contains primarily individual dependent names (already parsed into first, middle, and last) as well as generic fields for alias names. We do not know whether a given alias is a first name or a last name, and at times it may be a combination of both. Our business requirements call for match codes on various combinations of these names. We have noticed that we get more reliable match codes when we combine a last name with a first name and run the name match on the combination, rather than trying to match the individual parts. The output will be a text file of the original names plus many, many match codes: two for each combination of name parts requested in the requirements.
So, with the many different fields available to us, we are finding the job we need to create is becoming quite complex:
1. If alias 1 is not null, combine it with the dependent last name and create match codes at sensitivities 90 and 75 for a name of the form "alias as first name + dependent last name"; otherwise, pass back null match codes.
2. If alias 1 is not null, combine it with the dependent first name and create match codes at sensitivities 90 and 75 for a name of the form "dependent first name + alias as last name"; otherwise, pass back null match codes.
3. Repeat these steps for each of the 10 alias fields, generating match codes only when that alias is filled in.
4. Additional matching based on dependent names only, etc.
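For readers who want to see the rules in steps 1 to 3 written out, here is a minimal sketch in Python rather than in DataFlux itself. Everything named here is an assumption for illustration: `placeholder_match_code` is a hypothetical stand-in and does not reproduce the DataFlux match code algorithm, and the output field names such as `alias1_as_first_mc90` are invented. The sketch also assumes that a blank dependent first or last name should produce null match codes, which the requirements above do not state.

```python
from typing import Callable, Dict, Optional, Sequence

# Sensitivities requested in the requirements above.
SENSITIVITIES = (90, 75)


def placeholder_match_code(name: str, sensitivity: int) -> str:
    """Hypothetical stand-in for a real match code generator.

    This is NOT the DataFlux algorithm. It only keeps the letters,
    uppercases them, and truncates harder at the lower sensitivity.
    """
    letters = "".join(ch for ch in name.upper() if ch.isalpha())
    keep = 8 if sensitivity >= 90 else 5
    return letters[:keep]


def is_blank(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as null."""
    return value is None or value.strip() == ""


def alias_match_codes(
    first: Optional[str],
    last: Optional[str],
    aliases: Sequence[Optional[str]],
    match_code: Callable[[str, int], str] = placeholder_match_code,
) -> Dict[str, Optional[str]]:
    """Build two match codes per alias/name combination, or None when null."""
    out: Dict[str, Optional[str]] = {}
    for i, alias in enumerate(aliases, start=1):
        combos = {
            f"alias{i}_as_first": (alias, last),   # alias + dependent last name
            f"alias{i}_as_last": (first, alias),   # dependent first name + alias
        }
        for label, (given, family) in combos.items():
            for sensitivity in SENSITIVITIES:
                key = f"{label}_mc{sensitivity}"
                # Assumption: any blank part yields a null match code.
                if is_blank(given) or is_blank(family):
                    out[key] = None
                else:
                    full_name = f"{given.strip()} {family.strip()}"
                    out[key] = match_code(full_name, sensitivity)
    return out
```

Calling `alias_match_codes("Mary", "Smith", ["Polly", None])` returns eight entries: four populated codes for alias 1 and four null codes for the empty alias 2. Looping over the alias fields keeps the rule in one place instead of repeating it in ten separate branches.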
We are trying to create a job that performs well, and we want to understand best practices for using the Expression node and branching. We have a few older jobs that call out to other jobs and are wondering whether this is an approach to consider.
Any hints, best practices, or the chance to talk with someone who has ideas to assist our design would be appreciated.
Hi,
In my experience with name matching, you should try to generate match codes on the full name, so you are headed in the right direction with that. You can use the Customize component to see exactly what the match code algorithm does to single-token names, full names, and combinations. A few other thoughts:
Ron
Ron,
Thank you for responding to my questions and confirming the matching on full names. We will continue using full name matches for this job.
We have decided to use branching to call an embedded job that determines match codes for the alias-dependent name combinations when the alias is not null. This creates a fairly easy-to-support job, albeit one with several branches for the various aliases.
If this job does not test well, I will follow up for more information on how to use the expression as you described in the original response. So far, we are not seeing issues in testing.
I appreciate the help!
Tracy Bauer