BookmarkSubscribeRSS Feed
ysk
Calcite | Level 5 ysk
Calcite | Level 5

Hi I want to clean up my data which contains full name.

 

For example,

 

1) John, Smith

2) Kevin, Space R.

3) John, Doe R.C.

4) Kevin, Space Joseph

 

Basically i want 2,3, and 4 to look like 1 and get rid of trailling letter or name (i.e. R., R.C., Joseph).

 

Thanks

 

6 REPLIES 6
Astounding
PROC Star

Here's a statement that should do the trick:

 

new_name = catx(', ', scan(name, 1, ', '), scan(name, 2, ', '));

 

Notice that you can still run into trouble with two-word names before the comma.  A mildly more complex approach could handle that.

ysk
Calcite | Level 5 ysk
Calcite | Level 5
Hi Astounding, thank you for your reply.

How about instances where names are like this.

1) John, P. Richard

how can i clean this up to align with the first example?

Thanks
Astounding
PROC Star

Well, if we're going to expand the list of possibilities let's make sure we cover them all.  I assume you want John, P. Richard to become John, Richard.  But what should each of these become?

 

John, R.

John, R.C.

John, R. C.

 

If you can think of any more possibilities, this would be the time to mention them.

ysk
Calcite | Level 5 ysk
Calcite | Level 5
Here are different variations (format) of names in the dataset

1. John,Doe -> John,Doe
2. John,Doe D. -> John,Doe
2. John,D. Doe -> John,Doe
3. John,Doe-Johnson -> John, Doe-Johnson


Astounding
PROC Star

OK, here's one way:

 

new_name = scan(name, 1, ',');

part2 = scan(name, 2, ',');

if part2 > ' ' then do;

   new_name = trim(new_name) || ',';

   do i=1 to countw(part2, ' ');

      next_piece = scan(part2, i, ' ');

      if index(next_piece, '.') = 0 then new_name = trim(new_name) || ' ' || next_piece;

   end;

end;

 

I'm not expecting any problems with it, but this is untested code so let me know if you have any difficulties with it.

Ksharp
Super User
data have;
input name $40.;
cards;
John, Smith
Kevin, Space R.
John, Doe R.C.
Kevin, Space Joseph
John,Doe 
John,Doe D.
John,D. Doe 
John,Doe-Johnson 
;
run;
data want;
 set have;
 temp=prxchange('s/[a-z]\./ /i',-1,name);
 new_name=catx(',',scan(temp,1,', '),scan(temp,2,', '));
 drop temp;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 888 views
  • 0 likes
  • 3 in conversation