Hi:
Here's some feedback from the course instructors:
=== === Longest feedback from an instructor === ===
The match code is created by following the steps shown on the slide on that page (slide 45).
First, the string John Q Smith is parsed into tokens:
Given Name: John
Middle Name: Q
Family Name: Smith
Then any noise words are removed:
<there aren’t any in this string>
Transformations are made:
John > Jon
Phonetics are applied:
<not sure if any are applied in this case>
Then based on sensitivity, relevant components (tokens) are determined:
At 85% sensitivity, the middle name is not significant, but the given name and family name are.
At this point we have <START>Smith Jon <END>
Many spaces are reserved for the family name (in case it is long) and for the given name, plus one space for the middle name (which is removed here because it is not significant at 85%).
Then a keyboard transformation is applied:
SMITH = 4B&~2
<blanks> = $
JON = C@P
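
As a rough illustration of the steps above, here is a minimal Python sketch. It is not how the QKB works internally. The noise-word list, the rule that the middle name only counts above 85, and the function name match_code_parts are assumptions made up for this example. The encoded values are simply copied from the example above rather than computed. The sketch also returns the encoded pieces separately instead of one padded string, because the example does not say how many blank positions are reserved.

```python
# Illustrative only: these tables are hypothetical, not the QKB's real definitions.
NOISE_WORDS = {"mr", "mrs", "dr"}            # assumed noise-word list
TRANSFORMS = {"john": "jon"}                 # from the example: John > Jon
ENCODED = {"SMITH": "4B&~2", "JON": "C@P"}   # values quoted in the example


def match_code_parts(name, sensitivity=85):
    """Return the encoded significant tokens for a 'Given Middle Family' name."""
    # 1. Parse into tokens (naive: first word, middle words, last word).
    words = name.split()
    tokens = {
        "given": words[0],
        "middle": " ".join(words[1:-1]),
        "family": words[-1],
    }
    # 2. Remove tokens that are noise words.
    tokens = {k: v for k, v in tokens.items() if v.lower() not in NOISE_WORDS}
    # 3. Apply transformations (John > Jon); other values pass through.
    tokens = {k: TRANSFORMS.get(v.lower(), v.lower()) for k, v in tokens.items()}
    # 4. Phonetics: skipped, since the example is not sure any apply here.
    # 5. Keep the significant tokens, family name first.
    #    Assumption: the middle name only matters above 85.
    order = ["family", "given"] + (["middle"] if sensitivity > 85 else [])
    kept = [tokens[k].upper() for k in order if tokens.get(k)]
    # 6. Encode by lookup; tokens without a known encoding are left as-is.
    return [ENCODED.get(t, t) for t in kept]


print(match_code_parts("John Q Smith"))
```

Because the middle name is dropped at 85, "John Q Smith" and "John Smith" give the same pieces in this sketch, which is the behavior that makes match codes useful for finding duplicates.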
=== ===
=== === More feedback from second instructor === ===
The earlier chapters discuss the construction of match code strings only “lightly”. This chapter mostly shows how a match code string can help with entity resolution within a single data file. We explore the QKB a bit more in Chapters 7-9. A match definition (which generates a match code) is discussed in more detail in Ch9 *but* it is necessary to understand Ch7 & Ch8 before jumping to Ch9.
=== ===
Hope this helps. It sounds like they recommend reviewing Chapters 7-9.
Cynthia