08-29-2016 03:41 PM
I have a time-series dataset with a variable that contains company names. The same company often appears under different names in different observations over time. I want to match and replace the company names so that companies which are (obviously) the same have a single name in the dataset rather than several different ones.
Ex: XYZ, XYZ Company, XYZ & CO, XYZ CO, etc. should all have one name, for example 'XYZ Company'.
08-29-2016 04:09 PM
The name matching problem is addressed in chapter 81 of Professional SAS Programming Shortcuts with example code at http://www.globalstatements.com/shortcuts/81b.html. The example is based on the idea of matching two separate data sets but most of the same ideas would apply when the data is in a single data set.
In a smaller version of the problem, where there are only a few hundred distinct values, you may just want to create a lookup table that maps each distinct character value to the canonical form of the name that you want to see.
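To make the lookup-table idea concrete, here is a minimal sketch. It is written in Python only for brevity and is not taken from the book chapter. The table contents, the function names `canonical_name` and `unmapped_names`, and the whitespace cleanup are all assumptions for illustration; the same lookup could be done in SAS.

```python
# Hypothetical mapping table: each distinct raw value -> canonical name.
# In practice you would build this by reviewing the distinct values in your data.
NAME_MAP = {
    "XYZ": "XYZ Company",
    "XYZ Company": "XYZ Company",
    "XYZ & CO": "XYZ Company",
    "XYZ CO": "XYZ Company",
}


def clean(raw):
    """Trim the value and collapse runs of whitespace (an assumed cleanup step)."""
    return " ".join(raw.split())


def canonical_name(raw, name_map=NAME_MAP):
    """Return the canonical name, or the cleaned value unchanged if it is not in the table."""
    key = clean(raw)
    return name_map.get(key, key)


def unmapped_names(names, name_map=NAME_MAP):
    """List distinct values missing from the table so they can be reviewed by hand."""
    return sorted({clean(n) for n in names} - set(name_map))


# Made-up observations standing in for the company-name variable.
observations = ["XYZ", "XYZ & CO", " XYZ  CO ", "ABC Ltd"]
cleaned = [canonical_name(n) for n in observations]
to_review = unmapped_names(observations)
```

Values that are not in the table pass through unchanged, so nothing is silently renamed, and `unmapped_names` gives you the list of names you still have to decide on.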