Hello,
I want two merge two databases:1) KLD, 2)Compustat
They have two common variables: 1)Year and 2)Company Name.
Because my data are panel data I have to use both common variables to merge.
However, company name is not exactly similar in two databases (they may different in terms of capital letters, abbreviations, spelling errors, etc.)
Could you please let me know the coding of how merging these two databases.
I am not familiar with SAS, but I have found out SAS can solve this problem.
If somebody give me the whole codes for this purpose, I would be very thankful.
Kind regards,
FA1
You will have to fix all those defects to merge the datasets. Depending on the modules you have licensed, this is more or less painful.
Sorry, but data-cleaning is not the most joyful task one can have.
Your first step will be to get to know your data and identify the data quality problems in detail. From there you can develop the necessary steps to rectify them.
Run proc contents first to see if you have discrepancies in terms of variable types and variable sizes.
Then run proc sort with nodupkey on your datasets (use the out= option to create new, intermediate datasets) and then compare the results, and look if you have multiple entries for the same business object.
Now, if you have SAS Data Quality licensed, this tool can do most of this work for you.
Dear KurBremser
Thank you so Much.
Yes, SAS Data Quality Server is the best option for me to solve this problem.
Actually there are two ways to overcome this problem 1)Scoring method, 2) Fuzzy matching using SAS Data Quality Server. I have attached a pdf that explains these two ways very well. It was very helpful for me.
Kind regards,
FA1
KLD and Compustat also have other common variables: CIK (Central Index Key, issued by the SEC), and stock ticker. Now stock ticker for a given company can change over time, so you would have to be aware of synchronization issues. And the database vendors may not have fully populated CIK. But I'd suggest getting those variables for matching purposes.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.