Data comparison

Occasional Contributor
Posts: 14

Hi,

 

I have two data sets. One contains a list of the formal names of Fortune 500 companies. The other contains patent information for many companies, extracted from online sources, so the format of a company's name may differ from the first data set even when both refer to the same company.

 

What I want to do is select the patent information for the companies that are among the Fortune 500. So I have to compare the names between the two data sets. I used the sounds-like (=*) operator and treated each of the Fortune 500 companies as a macro variable, as the following code shows:

 

data _null_;
set company0;
suffix=put(_n_,5.);
call symput(cats('company',suffix), company);
run;

 

data selected_patent;
set patent0;
where organization_name =* "&company1.";
company=put("&company1.", $25.);
run;

 

&company1. stands for one of the Fortune 500 companies, but I have 500 macro variables like this.

 

One way I can think of is to run a macro program, but then I would have to write 500 calls to the macro!

 

Is there an easier way to finish this task, instead of calling a macro program 500 times?

 

Thanks,

Sherri


All Replies
Super User
Posts: 19,770

Re: Data comparison

I usually recommend the solutions in this thread, primarily FriedEgg's. He also suggests an open-source tool called the Link King, if you can install software. That's usually not an option where I work.

 

https://communities.sas.com/t5/SAS-Procedures/Name-matching/td-p/82780

 

Otherwise, you might want to consider a data step solution if that's an option. You can load the names into a temporary array and loop through them; there should be code that I wrote for that solution somewhere on here.
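
As a rough illustration of that data step idea (not the code referred to above), here is a minimal sketch. It assumes company0 has a character variable company, patent0 has organization_name but no variable named company, and there are at most 500 names. The array size, the $100 name length and the $25 output length are assumptions. It uses the SOUNDEX function, which the sounds-like operator is based on.

```sas
/* Sketch only: array size, lengths and variable names are assumptions. */
data selected_patent;
  array names[500] $100 _temporary_;
  length company $25;

  /* Load the Fortune 500 names into memory once, on the first iteration. */
  if _n_ = 1 then do _p = 1 to min(n_names, dim(names));
    set company0(rename=(company=_name)) nobs=n_names point=_p;
    names[_p] = _name;
  end;

  /* Read the patent records one at a time. */
  set patent0;

  /* Compare each patent holder against every stored name. */
  do i = 1 to min(n_names, dim(names));
    if soundex(organization_name) = soundex(names[i]) then do;
      company = names[i];
      output;   /* one row per matching Fortune 500 name */
    end;
  end;

  drop i _name;
run;
```

A patent record that sounds like several company names is written once per match, so the result may need de-duplicating afterwards.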

 

Calling a macro 500 times is easy if you use call execute, but not efficient. 
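
For completeness, a sketch of what that could look like. The macro name select_one and the work data set _one are hypothetical. It assumes the macro variables company1, company2, ... were already created by the first data step in the question, and that selected_patent does not exist yet (otherwise the rows are appended to the old copy).

```sas
/* Hypothetical macro: runs the original selection for company number &i. */
%macro select_one(i);
  data work._one;
    set patent0;
    where organization_name =* "&&company&i";
    length company $25;
    company = "&&company&i";
  run;

  /* Stack the matches for this company onto the combined result. */
  proc append base=selected_patent data=work._one;
  run;
%mend select_one;

/* Generate one macro call per row of company0. %nrstr delays the    */
/* macro's execution until after this data step has finished.        */
data _null_;
  set company0;
  call execute(cats('%nrstr(%select_one(', _n_, '))'));
run;
```

This reads patent0 once per company, which is why it is not efficient. Names containing an ampersand (for example AT&T) will trigger macro resolution warnings, and names containing double quotes would need extra quoting.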

Solution
‎02-21-2016 11:53 AM
Super User
Posts: 5,424

Re: Data comparison

Am I missing something, or can't this be solved in a simple SQL join?
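
Something along these lines, as a minimal sketch. It assumes company0 has a character variable company and that patent0 has organization_name but no variable named company; the sounds-like operator is carried over from the code in the question.

```sas
/* One pass over both tables instead of 500 macro variables. */
proc sql;
  create table selected_patent as
  select p.*, c.company
  from patent0 as p
       inner join company0 as c
       on p.organization_name =* c.company;  /* sounds-like comparison */
quit;
```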
Data never sleeps
Occasional Contributor
Posts: 14

Re: Data comparison

No, you didn't miss anything. You are right! I lost myself!!

Thank you!
Sherri
Super User
Posts: 19,770

Re: Data comparison

Yes, it's a straightforward SQL join, but the next question is typically how to do this 'fuzzy' comparison with different options, which is what FriedEgg's solution covers, using SOUNDEX, COMPGED, and other comparison functions.
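
To give a flavor of the COMPGED variant (this is not FriedEgg's code, just a sketch under the same assumptions about company0 and patent0): the output name fuzzy_patent, the column ged and the cutoff of 200 are made up for illustration and would need tuning on real data.

```sas
/* Score every patent holder against every Fortune 500 name and keep */
/* the close pairs. Lower generalized edit distance = closer match.  */
proc sql;
  create table fuzzy_patent as
  select p.*, c.company,
         compged(upcase(strip(p.organization_name)),
                 upcase(strip(c.company))) as ged
  from patent0 as p, company0 as c
  where calculated ged <= 200   /* hypothetical cutoff */
  order by p.organization_name, ged;
quit;
```

This compares every pair of rows, so it can be slow when patent0 is large; reviewing the borderline scores by hand is usually part of the process.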

Occasional Contributor
Posts: 14

Re: Data comparison

Good resource! Thank you so much! Learned new stuff!!