Hi All,
I have 2 datasets, which i need to match and delete those records which match. Below is sample of what i am referring to:
Dataset1:
| Name1 |
| Valentino Rossi |
| Marc Marquez |
| George Lorenzo |
| Valentino |
Dataset2:
| Name2 |
| Valentino |
| Marc |
i am using query below:
proc sql;
CREATE TABLE del_name as Select
A.*,
B.*
FROM Work.dateset1 as a
left join Work.Dateset2 as B
on A.Name like B.Name
;
run;
quit;
The Issue is that, it matches only exact cases and does not consider anything preceeding the first name.
| Name1 | Name2 |
| Valentino Rossi | |
| Marc Marquez | Marc |
| George Lorenzo | |
| Valentino | Valentino |
The Output should look something like below:
| Combined Data | |
| Name1 | Name2 |
| Valentino Rossi | Valentino |
| Marc Marquez | Marc |
| George Lorenzo | |
| Valentino | Valentino |
Appritiate your help.
Thanks
data a;
infile datalines delimiter=',';
length name $30;
input name;
datalines;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
data b;
infile datalines delimiter=',';
length name $30;
input name;
datalines;
Valentino
Marc
;
proc sql;
create table want as
select a.name as name1,
b.name as name2
from a
left join b on a.name like cats('%', b.name, '%');
quit;
data a;
infile datalines delimiter=',';
length name $30;
input name;
datalines;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
data b;
infile datalines delimiter=',';
length name $30;
input name;
datalines;
Valentino
Marc
;
proc sql;
create table want as
select a.name as name1,
b.name as name2
from a
left join b on a.name like cats('%', b.name, '%');
quit;
Thanks udden2903 .
That was a bulls eye. I was unsure on how i could use that (cats('%', b.name, '%');).
Great Help. Thank you
If you want match the words start with in each other .
data Dataset1;
input Name $40.;
cards;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
run;
data Dataset2;
input Name $40.;
cards;
Valentino
Marc
;
run;
proc sql;
CREATE TABLE del_name as Select
A.name as name1,
B.name as name2
FROM Work.dataset1 as a
left join Work.Dataset2 as B
on A.Name eqt strip(B.Name)
;
quit;
Thank you Ksharp
data names;
length name $20;
input name;
datalines;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
run;
data nicknames;
length nickname $20;
input nickname;
datalines;
Valentino
Marc
;
run;
data test;
set names;
match=0;
do i=1 to xnobs;
set nicknames nobs=xnobs point=i;
if index(upcase(strip(name)),upcase(strip(nickname)))>0 then do; match=1; output; end;
end;
if match=0 then do;nickname=.;output;end;
drop match;
run;
proc print data=test;
run;
Thank you Ctrlx
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.