Desktop productivity for business analysts and programmers

Fuzzy Lookup?

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Fuzzy Lookup?

I am new to SAS EG and I am not familiar with the programming side, only the point and click interface. 

I have one column titled 'Name', and another column titled 'Name and Address' (sample attached). I need to determine if the name is contained in the name and address column.

What would you suggest is the best way to perform this action? In the query builder I tried creating a computed column with the contains operator and a second attempt with the scan function (which I thought was closest to a fuzzy lookup) but those did not work for me. 

Any thoughts or guidance is much appreciated! Thanks in advance!


Accepted Solutions
Solution
4 weeks ago
Super User
Posts: 6,622

Re: Fuzzy Lookup?

Another issue that may apply is the presence of trailing blanks.  It is possible that the first comparison locates the NAME, when the second one doesn't:

 

location = index(name_plus_address, strip(name));

location = index(name_plus_address, name);

 

The STRIP function removes leading and trailing blanks.  The second statement (omitting STRIP), might be searching for a longer string that includes the trailing blanks ... probably less than ideal.

View solution in original post


All Replies
Super User
Posts: 23,224

Re: Fuzzy Lookup?

[ Edited ]

This is not a fuzzy lookup, you're looking for FIND or INDEX to search for a particular string instead. This assumes that comparisons are along a single row. If you need to compare the name from row 2 with the address/name from Row 1 that's a different problem.

 

This would be a calculated/computed column with the query builder, under Advanced Calculations most likely. You can use the INDEX or FIND function within the computed column builder.

 


@RandyW wrote:

I am new to SAS EG and I am not familiar with the programming side, only the point and click interface. 

I have one column titled 'Name', and another column titled 'Name and Address' (sample attached). I need to determine if the name is contained in the name and address column.

What would you suggest is the best way to perform this action? In the query builder I tried creating a computed column with the contains operator and a second attempt with the scan function (which I thought was closest to a fuzzy lookup) but those did not work for me. 

Any thoughts or guidance is much appreciated! Thanks in advance!


 

 

Solution
4 weeks ago
Super User
Posts: 6,622

Re: Fuzzy Lookup?

Another issue that may apply is the presence of trailing blanks.  It is possible that the first comparison locates the NAME, when the second one doesn't:

 

location = index(name_plus_address, strip(name));

location = index(name_plus_address, name);

 

The STRIP function removes leading and trailing blanks.  The second statement (omitting STRIP), might be searching for a longer string that includes the trailing blanks ... probably less than ideal.

New Contributor
Posts: 2

Re: Fuzzy Lookup?

Posted in reply to Astounding

Thanks to all! The following worked best for my needs but the other suggestions were good options too!

 

location = index(name_plus_address, strip(name));

Valued Guide
Posts: 556

Re: Fuzzy Lookup?

If you have mixed case then use FIND(Name_address,Name,'i')

Thanks,
Suryakiran
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 155 views
  • 0 likes
  • 4 in conversation