Searching a string variable for various words contained in another variable


Hi,

I am trying to find possible matches between two datasets using string variables that contain company names.

I did a full join of the two datasets.

I have experimented with COMPGED, but I would like to try another approach: counting how many words of the second dataset's variable are found in the first dataset's variable.

For instance, say that after the join I have something like this:

var1                   var2
AAA BBB corporation    AAA BBB limited
AAA BBB corporation    AAA BBB corp.
AAA BBB corporation    CCC DDD EEE ltd

I would like to compute a variable that has the following values:

var1                   var2               score
AAA BBB corporation    AAA BBB limited    2
AAA BBB corporation    AAA BBB corp.      3
AAA BBB corporation    CCC DDD EEE ltd    0

As the second record shows, I would like to take punctuation into account if possible, so that "corp." still counts as a match for "corporation".

Any help is, as always, much appreciated.

Thank you very much in advance.


Accepted Solutions
‎02-24-2018 06:03 AM

Re: Searching a string variable for various words contained in another variable

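Hello,

The step below splits var2 into words, using blanks and periods as delimiters, and counts how many of those words occur in var1:

data want;
    set have;

    /* Number of words from var2 that also occur in var1 */
    score = 0;

    /* Split var2 on blanks and periods, so "corp." yields "corp" */
    do i = 1 to countw(var2, " .");
        /* FIND returns 0 when the word is absent; a substring hit counts */
        if find(var1, scan(var2, i, " .")) then score = score + 1;
    end;

    drop i;  /* the loop counter is not needed in the output */
run;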
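
To try the approach on the records from the question, here is a minimal sketch. The input dataset `have`, the `$40` lengths, and the pipe delimiter are assumptions made only for this illustration; they are not part of the original post. The `'it'` modifiers on FIND (ignore case, trim trailing blanks) are an optional addition, not something the accepted answer uses.

```sas
/* Hypothetical test data: the three pairs shown in the question. */
/* A pipe delimiter is used so that blanks inside names are kept. */
data have;
    infile datalines dlm='|' truncover;
    length var1 var2 $40;
    input var1 var2;
    datalines;
AAA BBB corporation|AAA BBB limited
AAA BBB corporation|AAA BBB corp.
AAA BBB corporation|CCC DDD EEE ltd
;
run;

data want;
    set have;
    score = 0;
    do i = 1 to countw(var2, ' .');
        /* 'i' = ignore case, 't' = trim trailing blanks in both arguments */
        if find(var1, scan(var2, i, ' .'), 'it') > 0 then score = score + 1;
    end;
    drop i;
run;

proc print data=want noobs;
run;
```

With these three rows the scores come out as 2, 3 and 0, matching the values requested in the question. Because FIND looks for substrings, "corp" is counted as a hit inside "corporation"; the same behaviour means that very short words in var2 can match inside longer, unrelated words in var1, so the score is best treated as a rough similarity measure.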
