BookmarkSubscribeRSS Feed
ysk
Calcite | Level 5 ysk
Calcite | Level 5

Hi I want to clean up my data which contains full name.

 

For example,

 

1) John, Smith

2) Kevin, Space R.

3) John, Doe R.C.

4) Kevin, Space Joseph

 

Basically i want 2,3, and 4 to look like 1 and get rid of trailling letter or name (i.e. R., R.C., Joseph).

 

Thanks

 

6 REPLIES 6
Astounding
PROC Star

Here's a statement that should do the trick:

 

new_name = catx(', ', scan(name, 1, ', '), scan(name, 2, ', '));

 

Notice that you can still run into trouble with two-word names before the comma.  A mildly more complex approach could handle that.

ysk
Calcite | Level 5 ysk
Calcite | Level 5
Hi Astounding, thank you for your reply.

How about instances where names are like this.

1) John, P. Richard

how can i clean this up to align with the first example?

Thanks
Astounding
PROC Star

Well, if we're going to expand the list of possibilities let's make sure we cover them all.  I assume you want John, P. Richard to become John, Richard.  But what should each of these become?

 

John, R.

John, R.C.

John, R. C.

 

If you can think of any more possibilities, this would be the time to mention them.

ysk
Calcite | Level 5 ysk
Calcite | Level 5
Here are different variations (format) of names in the dataset

1. John,Doe -> John,Doe
2. John,Doe D. -> John,Doe
2. John,D. Doe -> John,Doe
3. John,Doe-Johnson -> John, Doe-Johnson


Astounding
PROC Star

OK, here's one way:

 

new_name = scan(name, 1, ',');

part2 = scan(name, 2, ',');

if part2 > ' ' then do;

   new_name = trim(new_name) || ',';

   do i=1 to countw(part2, ' ');

      next_piece = scan(part2, i, ' ');

      if index(next_piece, '.') = 0 then new_name = trim(new_name) || ' ' || next_piece;

   end;

end;

 

I'm not expecting any problems with it, but this is untested code so let me know if you have any difficulties with it.

Ksharp
Super User
data have;
input name $40.;
cards;
John, Smith
Kevin, Space R.
John, Doe R.C.
Kevin, Space Joseph
John,Doe 
John,Doe D.
John,D. Doe 
John,Doe-Johnson 
;
run;
data want;
 set have;
 temp=prxchange('s/[a-z]\./ /i',-1,name);
 new_name=catx(',',scan(temp,1,', '),scan(temp,2,', '));
 drop temp;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1097 views
  • 0 likes
  • 3 in conversation