Hi All,
I have 2 datasets, which i need to match and delete those records which match. Below is sample of what i am referring to:
Dataset1:
Name1 |
Valentino Rossi |
Marc Marquez |
George Lorenzo |
Valentino |
Dataset2:
Name2 |
Valentino |
Marc |
i am using query below:
proc sql;
CREATE TABLE del_name as Select
A.*,
B.*
FROM Work.dateset1 as a
left join Work.Dateset2 as B
on A.Name like B.Name
;
run;
quit;
The Issue is that, it matches only exact cases and does not consider anything preceeding the first name.
Name1 | Name2 |
Valentino Rossi | |
Marc Marquez | Marc |
George Lorenzo | |
Valentino | Valentino |
The Output should look something like below:
Combined Data | |
Name1 | Name2 |
Valentino Rossi | Valentino |
Marc Marquez | Marc |
George Lorenzo | |
Valentino | Valentino |
Appritiate your help.
Thanks
data a; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Rossi Marc Marquez George Lorenzo Valentino ; data b; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Marc ; proc sql; create table want as select a.name as name1, b.name as name2 from a left join b on a.name like cats('%', b.name, '%'); quit;
data a; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Rossi Marc Marquez George Lorenzo Valentino ; data b; infile datalines delimiter=','; length name $30; input name; datalines; Valentino Marc ; proc sql; create table want as select a.name as name1, b.name as name2 from a left join b on a.name like cats('%', b.name, '%'); quit;
Thanks udden2903 .
That was a bulls eye. I was unsure on how i could use that (cats('%', b.name, '%');).
Great Help. Thank you
If you want match the words start with in each other .
data Dataset1;
input Name $40.;
cards;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
run;
data Dataset2;
input Name $40.;
cards;
Valentino
Marc
;
run;
proc sql;
CREATE TABLE del_name as Select
A.name as name1,
B.name as name2
FROM Work.dataset1 as a
left join Work.Dataset2 as B
on A.Name eqt strip(B.Name)
;
quit;
Thank you Ksharp
data names;
length name $20;
input name;
datalines;
Valentino Rossi
Marc Marquez
George Lorenzo
Valentino
;
run;
data nicknames;
length nickname $20;
input nickname;
datalines;
Valentino
Marc
;
run;
data test;
set names;
match=0;
do i=1 to xnobs;
set nicknames nobs=xnobs point=i;
if index(upcase(strip(name)),upcase(strip(nickname)))>0 then do; match=1; output; end;
end;
if match=0 then do;nickname=.;output;end;
drop match;
run;
proc print data=test;
run;
Thank you Ctrlx
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.