Hi All,
I have 2 datasets, which i need to match and delete those records which match. Below is sample of what i am referring to:
Dataset1:
Name1 |
Valentino Rossi |
Marc Marquez |
George Lorenzo |
Valentino |
Dataset2:
Name2 |
Valentino |
Marc |
i am using query below:
proc sql;
CREATE TABLE del_name as Select
A.*,
B.*
FROM Work.dateset1 as a
left join Work.Dateset2 as B
on A.Name like B.Name
;
run;
quit;
The Issue is that, it matches only exact cases and does not consider anything preceeding the first name.
Name1 | Name2 |
Valentino Rossi | |
Marc Marquez | Marc |
George Lorenzo | |
Valentino | Valentino |
The Output should look something like below:
Combined Data | |
Name1 | Name2 |
Valentino Rossi | Valentino |
Marc Marquez | Marc |
George Lorenzo | |
Valentino | Valentino |
Appritiate your help.
Thanks
data a; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Rossi Marc Marquez George Lorenzo Valentino ; data b; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Marc ; proc sql; create table want as select a.name as name1, b.name as name2 from a left join b on a.name like cats('%', b.name, '%'); quit;
data a; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Rossi Marc Marquez George Lorenzo Valentino ; data b; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Marc ; proc sql; create table want as select a.name as name1, b.name as name2 from a left join b on a.name like cats('%', b.name, '%'); quit;
Thanks udden2903 .
That was a bulls eye. I was unsure on how i could use that (cats('%', b.name, '%');).
Great Help. Thank you
If you want match the words start with in each other .
data Dataset1;
input Name $40.;
cards;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
run;
data Dataset2;
input Name $40.;
cards;
Valentino
Marc
;
run;
proc sql;
CREATE TABLE del_name as Select
A.name as name1,
B.name as name2
FROM Work.dataset1 as a
left join Work.Dataset2 as B
on A.Name eqt strip(B.Name)
;
quit;
Thank you Ksharp
data names;
length name $20;
input name;
datalines;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
run;
data nicknames;
length nickname $20;
input nickname;
datalines;
Valentino
Marc
;
run;
data test;
set names;
match=0;
do i=1 to xnobs;
set nicknames nobs=xnobs point=i;
if index(upcase(strip(name)),upcase(strip(nickname)))>0 then do; match=1; output; end;
end;
if match=0 then do;nickname=.;output;end;
drop match;
run;
proc print data=test;
run;
Thank you Ctrlx
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.