BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ddavies
Calcite | Level 5

I need to eliminate the middle initial off records that have a middle initial (not all have one). Data Sample:

Smith, Tim
Howard, Matthew L.
Jones, Sarah H.
Charles, Andrew N.
Doe, John

 

In excel, I can use this formula: =IF(RIGHT(A2,1)=".",LEFT(A2,LEN(A2)-3),A2) to get my desired output:

 

Smith, Tim
Howard, Matthew
Jones, Sarah
Charles, Andrew
Doe, John

 

How can I replicate this in SAS? I assume it is a combination of if and substring but have not been able to quickly work it out and thought someone might have a fast solution.

 

TIA!

1 ACCEPTED SOLUTION

Accepted Solutions
JerryV
SAS Employee

data have;
input fullname $40.;
datalines;
Smith, Tim
Howard, Matthew L.
Jones, Sarah H.
Charles, Andrew N.
Doe, John
;;;;
data want;
set have;
fullname=catx(', ',scan(fullname,1), scan(fullname,2));
run;

View solution in original post

10 REPLIES 10
JerryV
SAS Employee

data have;
input fullname $40.;
datalines;
Smith, Tim
Howard, Matthew L.
Jones, Sarah H.
Charles, Andrew N.
Doe, John
;;;;
data want;
set have;
fullname=catx(', ',scan(fullname,1), scan(fullname,2));
run;

ddavies
Calcite | Level 5

Thank you for the speedy response. Much appreciated.

ddavies
Calcite | Level 5

Unfortunately, if the last name has a hyphen, this causes the results to not be as I wanted. Any suggestions?

JerryV
SAS Employee

Try adding the comma modifier (',') to the scan functions like this:

 

data want;
set have;
fullname=catx(', ',scan(fullname,1,','), scan(fullname,2,','));
run;

PaigeMiller
Diamond | Level 26

The problem with all of these name clean-up rules is that they still have a wide variety of names to handle, and usually simple rules don't work 100% of the time. If a person's name is listed as 

 

Van Arsdale, Charles

 

then none of the easily programmed rules will work.

--
Paige Miller
novinosrin
Tourmaline | Level 20

data have;
input names $20.;
cards;
Smith, Tim
Howard, Matthew L.
Jones, Sarah H.
Charles, Andrew N.
Doe, John
;

data want;
 set have;
 call scan(names, -1, _n_, _iorc_);
 if _n_ then  substr(names,_n_)=' ';
run;
ddavies
Calcite | Level 5

This one ends up removing the first name if there is no middle initial so it would only work if all my names had a middle initial.

ddavies
Calcite | Level 5

I was able to combine some different ideas and came up with this:

data want;
set have;
last=substr(fullname,length(fullname));
format last :$1.;

if last = '.' then name=substr(fullname, 1, length(fullname)-3);
else name = fullname;
drop last fullname; run;
novinosrin
Tourmaline | Level 20

Hi @ddavies  My apologies. I shouldn't have used -1 to look behind from right. How about using 3 as the index value. I think this might nail it?

 

data want;
 set have;
 call scan(names, 3, _n_, _iorc_);
 if _n_ then  substr(names,_n_)=' ';
run;
novinosrin
Tourmaline | Level 20

Hi again @ddavies  How about regular expressions

 


data have;
input names $20.;
cards;
Smith, Tim
Howard, Matthew L.
Jones, Sarah H.
Charles, Andrew N.
Doe, John
;

data want;
 set have;
 new_names=prxchange('s/(\s*\w+,)(\s+\w+)(\s*\w*.)/$1$2/', -1, names);
run;
names new_names
Smith, Tim Smith, Tim
Howard, Matthew L. Howard, Matthew
Jones, Sarah H. Jones, Sarah
Charles, Andrew N. Charles, Andrew
Doe, John Doe, John

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1413 views
  • 0 likes
  • 4 in conversation