art297
Opal | Level 21

Lan,

For three-character names, I have found the COMPGED function to be inefficient. However, SAS provides a number of alternatives (e.g., SOUNDEX).

There are also additional steps you can take to reduce likely noise before looking for matches (see, e.g., http://ftp.sas.com/techsup/download/observations/obswww15/obswww15.pdf ).
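As a rough illustration of that kind of pre-cleaning, here is a minimal sketch in Python rather than SAS. It is not taken from the linked paper; the function name `normalize_name` and the suffix list `LEGAL_SUFFIXES` are hypothetical and would need to be adapted to the actual data.

```python
import re

# Hypothetical list of legal suffixes, for illustration only; extend for real data.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "corp", "corporation", "co",
    "company", "ltd", "limited", "llc", "plc",
}

def normalize_name(name: str) -> str:
    """Lowercase a company name, strip punctuation, and drop trailing legal suffixes."""
    # Spell out ampersands, then replace anything that is not a letter,
    # digit, or space with a space.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower().replace("&", " and "))
    tokens = cleaned.split()
    # Remove suffixes such as "inc" or "corp" from the end, keeping at least one token.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

if __name__ == "__main__":
    for raw in ["Apple Inc.", "APPLE, INC", "Intl. Business Machines Corp."]:
        print(repr(raw), "->", repr(normalize_name(raw)))
```

Comparing the normalized strings instead of the raw ones means that differences in case, punctuation, and legal suffixes no longer count against a match.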

Finally, it would definitely help if you were matching with a set of valid company names. It sounds like you are trying to clean up both data sets simultaneously which, methinks, would only add more confusion.

Art

LanMin
Fluorite | Level 6

Thank you, Art, for sharing the resources and your advice! I will read them carefully.

A few more details on my data:

main data: contains the universe of publicly traded U.S. firm names (I kept one unique record per firm, similar to your code).

second data: contains the customer names of publicly traded companies; these customers are themselves companies.

My main data has accounting data (e.g., firm cash holdings, debt, assets, etc.) for each firm. My goal is to get the same information for the firms in my customer data; hence, I must match these two data sets.

In response to your latest comment: as far as I know, both data sets contain valid company names. However, the databases may not be built to users' satisfaction, and name abbreviations etc. may not be used consistently across the two data sets.
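To make the matching task concrete, here is a minimal sketch in Python rather than SAS, using the standard library's `difflib.get_close_matches`. The function name `match_customers`, the sample names, and the 0.85 cutoff are all assumptions for illustration; a real run would need a tuned cutoff and a manual review of the matches.

```python
from difflib import get_close_matches

def match_customers(customer_names, firm_names, cutoff=0.85):
    """Map each customer name to its closest firm name, or None if nothing is close enough."""
    # Compare on lowercased, trimmed names but return the original spelling.
    lookup = {name.lower().strip(): name for name in firm_names}
    keys = list(lookup)
    result = {}
    for customer in customer_names:
        hits = get_close_matches(customer.lower().strip(), keys, n=1, cutoff=cutoff)
        result[customer] = lookup[hits[0]] if hits else None
    return result

if __name__ == "__main__":
    # Made-up sample names, not taken from either data set.
    firms = ["Walmart Inc", "General Motors Co", "Ford Motor Co"]
    customers = ["WAL-MART INC", "General Motors", "Zebra Foods"]
    for customer, firm in match_customers(customers, firms).items():
        print(customer, "->", firm)
```

Customers that come back as `None` are the ones whose names differ too much from every firm name at the chosen cutoff; those would still need cleaning or a manual lookup before the accounting data can be merged in.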

Thanks again!

Lan
