That's not so easy. You basically need a cross join (every row of one dataset paired with every row of the other), keeping the pairs where one or more words from one name appear in the other name, or vice versa. Here is a possible solution that uses a data step to build the Cartesian product (by reading every observation from the second dataset with POINT=) and compare the names:

data one;
input name1 $40. /amount1;
cards;
wwwamazoncom
100.5
toysrus
50.25
OLIVE GARDEN
61.85
walMart
86.24
;run;
data two;
input name2 $40. /amount2;
cards;
US AMAZON AR USA
25.68
online toysrus newjersey us
126.98
ORDER olivegarden Washington DC
29.99
us wwwwalmartcom toys texas
75.86
;run;
data want;
  set one;
  score=0;
  length common $60;
  /* pair the current row of ONE with every row of TWO */
  do _N_=1 to nobs;
    set two nobs=nobs point=_N_;
    common=' ';
    score=0;
    /* count words of name1 (3+ characters) that occur anywhere in name2 */
    do i=1 to countw(name1);
      if length(scan(name1,i))<3 then continue;
      if find(name2,scan(name1,i),'i') then do;
        score=score+1;
        if not findw(common,scan(name1,i),' ','i') then
          call catx(' ',common,scan(name1,i));
      end;
    end;
    /* same check in the other direction: words of name2 inside name1 */
    do i=1 to countw(name2);
      if length(scan(name2,i))<3 then continue;
      if find(name1,scan(name2,i),'i') then do;
        score=score+1;
        if not findw(common,scan(name2,i),' ','i') then
          call catx(' ',common,scan(name2,i));
      end;
    end;
    /* keep only pairs with at least one matching word */
    if score>0 then output;
  end;
run;

Note that one pair of values was joined just on the word "toys"; you may want to increase the length value in the lines with "continue".
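
One way to try a different minimum word length without editing both loops by hand is to put the threshold in a macro variable. This is only a sketch, assuming the datasets one and two from the code above. The names minlen and want2 are made up for illustration, and 5 is just an example value that is longer than "toys" (4 characters).

```sas
%let minlen = 5;  /* hypothetical threshold: words shorter than this are ignored */

data want2;       /* hypothetical name, so WANT is kept for comparison */
  set one;
  length common $60;
  do _N_ = 1 to nobs;
    set two nobs=nobs point=_N_;
    common = ' ';
    score = 0;
    /* words of name1 found anywhere in name2 */
    do i = 1 to countw(name1);
      if length(scan(name1, i)) < &minlen then continue;
      if find(name2, scan(name1, i), 'i') then do;
        score = score + 1;
        if not findw(common, scan(name1, i), ' ', 'i') then
          call catx(' ', common, scan(name1, i));
      end;
    end;
    /* words of name2 found anywhere in name1 */
    do i = 1 to countw(name2);
      if length(scan(name2, i)) < &minlen then continue;
      if find(name1, scan(name2, i), 'i') then do;
        score = score + 1;
        if not findw(common, scan(name2, i), ' ', 'i') then
          call catx(' ', common, scan(name2, i));
      end;
    end;
    if score > 0 then output;
  end;
  drop i;  /* loop counter is not needed in the result */
run;
```

With a threshold of 5, "toys" is skipped, so on the sample data the toysrus/walmart pair should no longer be output. A higher threshold also ignores short but meaningful words, so it is worth checking the result against your real names.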