somebody
Lapis Lazuli | Level 10

I have to compare two company-name variables in order to match data. I am using the three functions in the title (COMPGED, COMPLEV, and SPEDIS).

compged = compged(name,name2);
complev = complev(name,name2);
spedis = spedis(name,name2);

What are the minimum and maximum values for each function, and what do the values mean? To my understanding, lower is better and 0 means an exact match. However, COMPGED returns a very high number (600 or 1000) for closely matched observations.

The attached photo shows some examples. The first 4 rows should be a match and the bottom 2 should not. The three functions return very high numbers for all of them. I am aware that this is because letters are missing from NAME2 in the first 4 rows. What I want is a match for the first 4 rows and no match for the last 2, but the values from the three functions do not let me separate the two groups.

What would be a better approach?

 

[Attached image: somebody_0-1590913065052.png]

3 REPLIES
Ksharp
Super User
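SPEDIS() stands for spelling distance.
The COMP*() functions compute edit distance: COMPGED returns the generalized edit distance and COMPLEV returns the Levenshtein edit distance.
They differ from one another, and which one fits best depends on the scenario.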
ChrisHemedinger
Community Manager

Some articles that might help:

 

ballardw
Super User

From a quick look at your example, you should seriously consider using the options to ignore case. Some of your results are being inflated by case differences.
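
A minimal sketch of what ignoring case could look like. The data set NAMES, its two rows, and the output variable names are hypothetical and only there to make the step runnable. COMPGED and COMPLEV accept a modifier string, where 'i' ignores case and 'l' removes leading blanks. SPEDIS takes no modifiers, so the sketch upcases its inputs instead. The ':' modifier, which truncates the longer string to the length of the shorter one, may also be worth a try if the letters missing from NAME2 are all at the end, but that is an assumption about your data.

```sas
/* Hypothetical sample rows; replace with your own NAME / NAME2 data */
data names;
   length name name2 $40;
   infile datalines dsd truncover;
   input name name2;
   datalines;
Acme Holdings Inc,ACME HOLDINGS
Global Widgets LLC,global widgets
;
run;

data scored;
   set names;
   /* 'i' = ignore case, 'l' = remove leading blanks */
   ged_i = compged(name, name2, 'il');
   lev_i = complev(name, name2, 'il');
   /* ':' additionally truncates the longer string to the shorter length */
   ged_trunc = compged(name, name2, 'il:');
   /* SPEDIS has no modifier argument, so upcase and strip by hand */
   spd_u = spedis(upcase(strip(name)), upcase(strip(name2)));
run;

proc print data=scored;
run;
```

Comparing GED_I against the plain COMPGED(name, name2) value on your own rows should show how much of the score came from case differences alone.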

 

The documentation for each function tells you what is considered and used to assign values. One note from the documentation is that COMPGED and COMPLEV are faster than SPEDIS.

 

COMPGED can also work with CALL COMPCOST, which lets you set the cost assigned to each kind of edit operation, though that is not an exercise for the faint of heart. This might be a serious advantage if you know a lot about how the strings from your sources tend to differ.
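
As a rough sketch of the CALL COMPCOST idea: the step below lowers the cost of the APPEND and TRUNCATE operations, which the documentation describes as adding or removing characters at the end of a string, so that a name that is merely cut short is penalized less. The data set NAMES, its rows, and the cost value 5 are all hypothetical; suitable costs depend on your data and would need tuning.

```sas
/* Hypothetical sample rows; replace with your own NAME / NAME2 data */
data names;
   length name name2 $40;
   infile datalines dsd truncover;
   input name name2;
   datalines;
Acme Holdings Inc,Acme Hold
Global Widgets LLC,Global Wid
;
run;

data scored_cost;
   set names;
   /* Set the costs once, before the first COMPGED call.
      The value 5 is an arbitrary illustration, not a recommendation. */
   if _n_ = 1 then call compcost('append=', 5, 'truncate=', 5);
   /* COMPGED now uses the adjusted costs; 'il' ignores case and leading blanks */
   ged_cost = compged(name, name2, 'il');
run;

proc print data=scored_cost;
run;
```

CALL COMPCOST affects only COMPGED, so COMPLEV and SPEDIS results stay the same.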

