As others have noted, adding the ‘i’ modifier to the COMPLEV and COMPGED functions makes them ignore case, which lowers the scores for names that differ mainly in capitalization. But there’s always a risk of false positives, as with your last two rows.
Assume you’re looking for a match for “Minnesota Power Company”, and among your rows of data you have “MN Power Co.” and “Minnesota Shower Company”. “Minnesota Shower Company” needs only two small edits to become “Minnesota Power Company”, so it scores as the closest match even though “MN Power Co.” is the one you want.
You can alleviate this a bit by using fuzzy logic in addition to other join or search criteria, i.e., match on other fields where available, and fuzzy match the names only where those fields agree. For example, you might have 8 companies at the same address, distinguished only by unique PO #s that aren’t in the data, so all the addresses look the same. If the fuzzy functions then find two company names at that address to be very similar, you can more safely assume they’re a valid match. But false positives are still possible. Unfortunately, “fuzzy” really is “fuzzy”, and not the warm kind. This logic can be helpful, but it comes with inherent risks.
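To make the "match on other fields first" idea concrete, here is a minimal sketch in PROC SQL. Everything in it is made up for illustration: the data set names (master, candidates), the variables (name, address), the sample addresses, and the cutoff of 500. Treat the cutoff in particular as a placeholder to tune by testing, not a recommendation.

```sas
/* Hypothetical sample data: names and addresses are invented */
data master;
   infile cards dsd truncover;
   input name :$75. address :$75.;
cards;
Minnesota Power Company,100 Main St
;
run;

data candidates;
   infile cards dsd truncover;
   input name :$75. address :$75.;
cards;
MN Power Co.,100 Main St
Minnesota Shower Company,12 Elm St
;
run;

/* Require an exact (case-insensitive) match on address,
   then score the names only for rows that pass that test */
proc sql;
   create table matches as
   select m.name as master_name,
          c.name as candidate_name,
          compged(m.name, c.name, 'i') as ged
   from master as m
        inner join candidates as c
        on upcase(m.address) = upcase(c.address)
   where calculated ged < 500   /* placeholder cutoff: an assumption */
   order by master_name, ged;
quit;
```

Because the address join runs first, “Minnesota Shower Company” is never compared against “Minnesota Power Company” at all. Whether “MN Power Co.” survives the name filter depends entirely on the cutoff you choose, so check the ged values on your own data before settling on one.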
With that said, in this case, using the ‘i’ modifier and filtering on COMPGED < 2500 gets you what you want - first four rows included, last two excluded:
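data test;
   infile cards dsd missover;
   input (Name1 Name2) (:$75.);
   /* Three fuzzy-match scores; lower means more similar */
   spedis=spedis(Name1,Name2);
   /* The 'i' modifier ignores case */
   complev=complev(Name1,Name2,'i');
   compged=compged(Name1,Name2,'i');
   /* Keep only pairs under the cutoff */
   if compged < 2500;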
cards;
Northern Property Real Estate Investment Trust, NORTHERN PPTY REAL ESTATE INVT TR
Mapletree Commercial Trust Units Real Estate Investment Trust Reg, MAPLETREE COMMERCI
Soilbuild Business Space REIT Units Real Estate Investment Trust, SOILBUILD BUSINESS
Northern Property real Estate Investment Trust / NorSerCo. Inc., NORTHERN PPTY REAL ESTATE INVT TR
Shanghai Tonva Petrochemical Co Ltd H Shares, Shanghai Dasheng Agriculture Finance Technology Co Ltd Class H
China Rongsheng Heavy Industries Group Holdings Ltd. H Shares, China Huarong Energy Co Ltd
;
run;
This kind of stuff requires lots of testing, possibly extra logic and maybe the use of more than one fuzzy technique, and finally a decision about the level of risk you’re willing to accept.
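One way to use more than one fuzzy technique is to keep a pair only when two measures agree. The sketch below assumes you want both COMPGED and COMPLEV under a cutoff. The macro variable names and the COMPLEV cutoff of 40 are my own placeholders, and the two sample rows are copied from the data above. I have not verified which rows these particular cutoffs keep, so treat it as a starting point for testing.

```sas
/* Placeholder cutoffs: assumptions to be tuned against your own data */
%let ged_cut = 2500;
%let lev_cut = 40;

data test_both;
   infile cards dsd truncover;
   input (Name1 Name2) (:$75.);
   ged = compged(Name1, Name2, 'i');
   lev = complev(Name1, Name2, 'i');
   /* Keep a pair only when both measures fall under their cutoffs */
   if ged < &ged_cut and lev < &lev_cut;
cards;
Northern Property Real Estate Investment Trust, NORTHERN PPTY REAL ESTATE INVT TR
China Rongsheng Heavy Industries Group Holdings Ltd. H Shares, China Huarong Energy Co Ltd
;
run;
```

Tightening either cutoff trades missed matches for fewer false positives, which is the risk decision mentioned above.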