You need to clean the data first, and there is no single shortcut for that. There are two ways to approach it.

1. Sort and summarise the data by company name. The summary should show you the different spellings, and you can then use it to clean the company names by applying the tranwrd function to the main data:

company=tranwrd(company,'mgmt','Management');
company=tranwrd(company,'Mgmt','Management');
company=tranwrd(company,'co','Company');
company=tranwrd(company,'Co','Company');

You can use all of the above in the same data step. Keep in mind that tranwrd replaces every occurrence of the target string, including inside longer words, so check short targets such as 'co' carefully. After the cleaning, summarise again, check the result, and clean whatever is left. You might have to summarise a few times until the entire data is cleaned.

2. Depending on your data, you can follow the same procedure, but first split the company name into two or three parts using the scan function and summarise on those words. This way you should be able to find the different spellings and clean them.
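The two approaches above could be sketched roughly as follows. This is only a sketch under assumptions: the dataset name `have`, the output names `clean` and `words`, the variables `company_clean`, `padded`, `word` and `i`, and the chosen lengths are all hypothetical and need to be adapted to your data. Padding the name with blanks before replacing ' Co ' is one possible way to avoid changing longer words that merely contain "Co"; it is not the only option.

```sas
/* Step 1: summarise by company name to see the spelling variants. */
/* Assumes a dataset HAVE with a character variable COMPANY.       */
proc freq data=have order=freq;
  tables company / nocum nopercent;
run;

/* Step 2: clean the names in one data step.                       */
data clean;
  set have;
  length padded $202 company_clean $200;

  /* Pad with blanks so that whole words can be targeted. */
  padded = cat(' ', strip(company), ' ');

  /* Each call works on the result of the previous one. */
  padded = tranwrd(padded, ' mgmt ', ' Management ');
  padded = tranwrd(padded, ' Mgmt ', ' Management ');
  padded = tranwrd(padded, ' co ',   ' Company ');
  padded = tranwrd(padded, ' Co ',   ' Company ');

  company_clean = strip(padded);
  drop padded;
run;

/* Step 3: summarise again and repeat step 2 until the list is clean. */
proc freq data=clean order=freq;
  tables company_clean / nocum nopercent;
run;

/* Approach 2: split each name into words and summarise the words. */
data words;
  set have;
  length word $50;
  do i = 1 to countw(company, ' ');
    word = scan(company, i, ' ');   /* i-th blank-delimited word */
    output;
  end;
  keep word;
run;

proc freq data=words order=freq;
  tables word / nocum nopercent;
run;
```

The word-level frequency table tends to make abbreviations and misspellings stand out, since each variant shows up as its own row, and those rows can then be turned into further tranwrd calls in step 2.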