I would like to keep only one record when the names are the same. For example, in my sample data, Mark Jones and Jones Mark are the same person, and I only want to keep one record, not both.
data have;
input name1$ name2 $ ;
datalines;
Mark Jones
Jones Mark
Jane Doe
Doe Jane
John Doe
Susy Z
John Smith
;
run;
I would appreciate some help.
data have;
input name1$ name2 $ ;
call sortc(name1,name2); /* put the two names in alphabetical order within each row */
datalines;
Mark Jones
Jones Mark
Jane Doe
Doe Jane
John Doe
Susy Z
John Smith
;
run;
proc sort data=have out=want nodupkey;
by name1 name2;
run;
Naturally, this doesn't account for spelling errors, middle names or middle initials, or other problems with how people's names were recorded. (It might not account for capitalization differences either, but that's easy to fix.) It does properly handle names that are palindromes.
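On the capitalization point, here is a minimal sketch of one possible fix, assuming the `have` data set from the question. The data set `keyed` and the variables `key1` and `key2` are hypothetical names introduced only for this illustration. The idea is to match on upper-cased copies of the names so the original spelling in `name1` and `name2` is left untouched.

```sas
/* Sketch only: keyed, key1 and key2 are hypothetical helper names */
data keyed;
  set have;
  length key1 key2 $8;      /* same length as name1/name2 from list input */
  key1 = upcase(name1);     /* case-insensitive copies used for matching */
  key2 = upcase(name2);
  call sortc(key1, key2);   /* alphabetical order within the row */
run;

/* Remove duplicates on the normalized keys, then drop the helpers */
proc sort data=keyed out=want(drop=key1 key2) nodupkey;
  by key1 key2;
run;
```

If your real names are longer than 8 characters, adjust the LENGTH statement to match the length of your name variables.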
Thanks for the caveat! This worked for my actual data set. I appreciate the help.
Hi @NewSASPerson, a curious, fun puzzle. Spicing it up with a hash object:
data have;
input name1$ name2 $ ;
datalines;
Mark Jones
Jones Mark
Jane Doe
Doe Jane
John Doe
Susy Z
John Smith
;
run;
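data want ;
  if _n_=1 then do;
    /* hash keyed on the name pair; it holds every pair kept so far */
    dcl hash H (ordered: "A") ;
    h.definekey ("name1","name2") ;
    h.definedone () ;
  end;
  set have;
  /* keep the row only if neither (name1,name2) nor (name2,name1) is already stored */
  if h.check() ne 0 and h.check(key:name2,key:name1) ne 0;
  h.add();
run;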
proc print noobs;run;
name1 | name2 |
---|---|
Mark | Jones |
Jane | Doe |
John | Doe |
Susy | Z |
John | Smith |