Hi, I am currently working on a research project. We need to identify firms that appear under different historical or recorded names and classify those variants as a single unique company.
For example, suppose we have some observations as follows:
1 the SAS Ltd. Co.
2 SAS Limited Co.
3 the SAS Limited
4 Company of SAS
We would like to find a tool that can identify records 1, 2, 3, and 4 as records for one unique company, "SAS". Our dataset is large, with more than 10,000 observations, all of which are organization and company names, so manual classification would be very time-consuming.
Does anyone know whether the Data Quality Solution in SAS can help me resolve this problem?
I have seen a couple of presentations on DQS and its capabilities, and from what I have seen, this kind of name matching is a good example of what it can do.
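To give a rough feel for the general idea behind this kind of matching, here is a minimal Python sketch. It is not how DQS works internally and uses none of its functions; it simply strips out an assumed list of "noise" words (the, of, Ltd, Limited, Co, Company and so on) and groups records that reduce to the same key. The names NOISE_WORDS, match_key and group_by_key are made up for illustration, and a real dataset would need a much longer word list plus some fuzzy matching for misspellings.

```python
import re
from collections import defaultdict

# Assumed noise words for illustration only; a real project would need a
# longer, domain-specific list (legal forms, articles, prepositions, ...).
NOISE_WORDS = {
    "the", "of", "co", "company", "ltd", "limited",
    "inc", "incorporated", "corp", "corporation",
}


def match_key(name: str) -> str:
    """Reduce a company name to a simple key for grouping."""
    # Lowercase and split into alphanumeric tokens, dropping punctuation.
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    # Remove the noise words and sort what is left so word order is ignored.
    core = [t for t in tokens if t not in NOISE_WORDS]
    return " ".join(sorted(core))


def group_by_key(names):
    """Group the original names under their shared key."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return dict(groups)


records = [
    "the SAS Ltd. Co.",
    "SAS Limited Co.",
    "the SAS Limited",
    "Company of SAS",
]
groups = group_by_key(records)
print(groups)
```

With the four sample records from the question, all of them reduce to the key "sas" and land in one group. Exact-key grouping like this only catches variants that differ in noise words, punctuation, case or word order; typos and abbreviations of the core name are where a dedicated data quality tool earns its keep.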
For more detailed examples of how DQS is used, and the sort of configuration required to match these records as belonging to one entity, see the following presentation delivered at the SAS Aust and NZ user group (2007 Q4)