Fuzzy match on the same variables

Contributor
Posts: 40

Hi,

I have a time-series dataset with a variable that contains company names. The spelling of a company's name varies across observations over time. I want to match and replace the names so that records that are (obviously) the same company carry one name in the dataset rather than several different ones.

For example, XYZ, XYZ Company, XYZ & CO, and XYZ CO should all end up with a single name such as 'XYZ Company'.

Contributor
Posts: 22

Re: Fuzzy match on the same variables

The name-matching problem is addressed in chapter 81 of Professional SAS Programming Shortcuts, with example code at http://www.globalstatements.com/shortcuts/81b.html. The example matches names across two separate data sets, but most of the same ideas apply when the data is in a single data set.
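To make the general idea concrete, here is a rough sketch of one common approach: reduce each name to a comparison key, then compare the keys. This is not the code from the book, and it is written in Python rather than SAS only so that it is short and easy to run. The list of noise words and the function names `match_key` and `similarity` are my own assumptions, so adjust them to your data.

```python
import re
from difflib import SequenceMatcher

# Assumed list of "noise" tokens to ignore; extend it for your own data.
NOISE_TOKENS = {"COMPANY", "CO", "CORP", "CORPORATION", "INC", "LTD", "AND"}

def match_key(name: str) -> str:
    """Reduce a company name to a comparison key."""
    # Upper-case, then turn every run of non-alphanumeric characters into a space.
    cleaned = re.sub(r"[^A-Z0-9]+", " ", name.upper())
    # Drop the noise tokens and rejoin what is left.
    tokens = [t for t in cleaned.split() if t not in NOISE_TOKENS]
    return " ".join(tokens)

def similarity(a: str, b: str) -> float:
    """Similarity of the two names' keys, from 0.0 to 1.0 (identical keys)."""
    return SequenceMatcher(None, match_key(a), match_key(b)).ratio()

# The four spellings from the question all reduce to the same key.
names = ["XYZ", "XYZ Company", "XYZ & CO", "XYZ CO"]
keys = {n: match_key(n) for n in names}
```

Names that share a key can be treated as the same company outright. For names whose keys differ only slightly, a threshold on `similarity` can flag candidate pairs for manual review. The threshold is something you would have to tune, because short names produce false matches easily.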

 

For a smaller version of the problem, with only a few hundred distinct values, you may just want to create a table that maps each distinct value to the canonical form of the name you want to see.
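A minimal sketch of the mapping-table idea, again in Python for brevity; in SAS the same table could live in a lookup data set that you join to the main data. The table contents, the sample rows, and the name `canonical_name` are made up for illustration.

```python
# Hypothetical lookup table: each distinct raw value -> canonical name.
CANONICAL = {
    "XYZ": "XYZ Company",
    "XYZ Company": "XYZ Company",
    "XYZ & CO": "XYZ Company",
    "XYZ CO": "XYZ Company",
}

def canonical_name(raw: str) -> str:
    """Return the canonical name, or the tidied raw value if it is unmapped."""
    tidied = " ".join(raw.split())  # trim and collapse repeated spaces
    return CANONICAL.get(tidied, tidied)

# Made-up time-series rows: (year, company name as recorded).
rows = [(2001, "XYZ"), (2002, "XYZ & CO"), (2003, "XYZ  CO"), (2004, "ABC Ltd")]
cleaned_rows = [(year, canonical_name(name)) for year, name in rows]
```

Leaving unmapped values unchanged, as `ABC Ltd` is here, makes it easy to list what is still missing from the table and add it by hand.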
