BookmarkSubscribeRSS Feed
Mary60
Calcite | Level 5

Hello,

Some of my output fields contain an extra comma at the end of the field. How can I eliminate that comma

For example, here is a "bad" record for a single "name" field containing a second comma in red(bad):

Smith,Vincent,

Here is the "good" record that only contains one comma and a space, not a second comma for mname(good):

Dobbins,Tyrone G

Here is another "good" record without a middle name

Smith, John

 

Here is how I am parsing the fname, mname and lname from the single "name" field, but the mname is giving me two fields mname for Smith

DATA NAMEA; SET NAM1;                  
  X=INDEXC(NAME1,',');                 
  Y=INDEXC(NAME1,' ');                  
  LNAME1=SUBSTR(NAME1,1,X-1);          
  FNAME1=SUBSTR(NAME1,X+1, Y-X);       
  MNAME1=SCAN(NAME1,3);                 
  IF (MNAME1 =FNAME1 ) THEN  MNAME1=' ';  

 

For the output on Vincent Smith, here's what I get:

VINCENT,, ,SMITH

 

Here's what I need:

VINNY, ,SMITH

 

How do I eliminate that second comma?

Thanks so much,

Mary

4 REPLIES 4
Mary60
Calcite | Level 5
I meant I need the output VINCENT, ,SMITH
Tom
Super User Tom
Super User

It is confusing what is the input and what is the output.

 

Are you starting with these strings as your input?

Smith,Vincent,
Dobbins,Tyrone G
Smith, John

And what is the output you want?

Do you want to make three variables with LAST, FIRST and MIDDLE names?  

Or do you want to make a comma delimited list of LAST,MIDDLE,FIRST?

 

Note: you will have a lot of trouble with people that use a space in the middle of their FIRST or LAST name.

 

To parse those input strings look for just the first comma location.

data have ;
  input string $50.;
cards;
Smith,Vincent,
Dobbins,Tyrone G
Smith, John
;

data want;
  set have;
  length first middle last new $50 ;
  loc=indexc(string,',');
  last=substr(string,1,loc-1);
  first=left(substr(string,loc+1));
  first=left(compbl(translate(first,' ',',')));
  loc=indexc(first,' ');
  middle=substr(first,loc+1);
  first=substr(first,1,loc-1);
  new =cat(trim(last),',',trim(middle),',',first);
  drop loc;
run;
proc print;
run;

Result

 Obs         string         first      middle     last            new

  1     Smith,Vincent,      Vincent              Smith      Smith, ,Vincent
  2     Dobbins,Tyrone G    Tyrone       G       Dobbins    Dobbins,G,Tyrone
  3     Smith, John         John                 Smith      Smith, ,John

 

 

Mary60
Calcite | Level 5
Tom, This is excellent. Thanks. We have solved elimination of the comma for the first observation, but now there's a guy with a Middle Initial following the second comma - see 4th line below. Would your code be the same for someone like him? Smith,Vincent, Dobbins,Tyrone G Smith, John Dobbins,Tyrone,L
Tom
Super User Tom
Super User

Just try it.

Spoiler
Obs         string         first      middle     last            new

  1     Smith,Vincent,      Vincent              Smith      Smith, ,Vincent
  2     Dobbins,Tyrone G    Tyrone       G       Dobbins    Dobbins,G,Tyrone
  3     Smith, John         John                 Smith      Smith, ,John
  4     Dobbins,Tyrone,L    Tyrone       L       Dobbins    Dobbins,L,Tyrone

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 822 views
  • 0 likes
  • 2 in conversation