Solved
Contributor
Posts: 42

# more name string

How to divide the whole name column into last name, first name, middle name columns? Thanks!

have:

 FIGLIULO,MARGARET A WHEELER,TANYA WALDRON,VIVIAN A

want:

 LAST NAME FIRST_NAME MIDDLE NAME FIGLIULO MARGARET A WHEELER TANYA WALDRON VIVIAN A

Accepted Solutions
Solution
‎09-09-2013 01:53 PM
PROC Star
Posts: 8,167

## Re: more name string

Any exceptions beyond those shown in the code below?

data have;

informat name \$80.;

input name &;

cards;

FIGLIULO,MARGARET A

WHEELER,TANYA

WALDRON,VIVIAN A

LEE,KHALID

LEWIS,G. P

MC NAVARRETE,SAUL

OKOCHA,MICHELLE JR

MC RANKIN,EBONY F

SHIRA,PAUL

SINGH,JASBIR

LEWIS, G. P.

DR. MC DERMIT,JOHN,SR.

MRS. FIGLIULO,MARGARET A

MS WHEELER,TANYA JANE

MS WALDRON,VIVIAN A II

MR LEE,KHALID

MS LEWIS,G. P

MR MC NAVARRETE,SAUL

MRS. OKOCHA,MICHELLE JR

MISS MC RANKIN,EBONY F

DR. SHIRA,PAUL

MR. SINGH,JASBIR

MR. LEWIS, G. P.

DR. MC DERMIT,JOHN,III

;

data want (drop=_;

set have;

_i=1;

_j=0;

name=compress(name,".");

do until (scan(name,_i+_j," ,") eq "");

select (_i);

when (1) do;

last=scan(name,_i+_j," ,");

if last in ("MC") then do;

_j+1;

last=catx(' ',last,scan(name,_i+_j," ,"));

end;

else if last in ("MR","MRS","MS","MISS","DR") then do;

_j+1;

_i=_i-1;

prefix=last;

call missing(last);

end;

end;

when (2) first=scan(name,_i+_j," ,");

otherwise do;

if scan(name,_i+_j," ,") in ("JR","SR","II","III","IV") then

suffix=scan(name,_i+_j," ,");

else middle=scan(name,_i+_j," ,");

end;

end;

_i+1;

end;

run;

All Replies
PROC Star
Posts: 8,167

## Re: more name string

One way would be to parse the string using the scan function.  e.g.:

data have;

informat name \$80.;

input name &;

cards;

FIGLIULO,MARGARET A

WHEELER,TANYA

WALDRON,VIVIAN A

;

data want;

set have;

_n_=1;

do until (scan(name,_n_," ,") eq "");

select (_n_);

when (1) last=scan(name,_n_," ,");

when (2) first=scan(name,_n_," ,");

otherwise middle=scan(name,_n_," ,");

end;

_n_+1;

end;

run;

Super Contributor
Posts: 308

Contributor
Posts: 22

## Re: more name string

One more approach using Substr and Index function.

data y1;

input name \$ 1-40;

cards;

FIGLIULO,MARGARET Apple

WHEELER,TANYA

WALDRON,VIVIAN A

;

run;

data y2;

set y1;

firstname= substr(name,1,index(name,',')-1);

lastname=substr(name,index(name,',')+1,index(name,' ')-index(name,',')-1);

middlename=substr(name,index(name,' '),index(name,''));

run;

Super User
Posts: 13,583

## Re: more name string

Have you confirmed that all of your data has a comma between names? Do you have titles like Mr Mrs Ms or things like Junior, II, III or IV?

Contributor
Posts: 42

## Re: more name string

The data is actually more complicate, many types of names, like the ones with MC NAVARRETE, OR MICHELLE JR...

 LEE,KHALID LEWIS,G. P MC NAVARRETE,SAUL OKOCHA,MICHELLE JR MC RANKIN,EBONY F SHIRA,PAUL SINGH,JASBIR
Contributor
Posts: 42

## Re: more name string

OR LIKE LEWIS,G. P, middle name in the middle, not at last....

Solution
‎09-09-2013 01:53 PM
PROC Star
Posts: 8,167

## Re: more name string

Any exceptions beyond those shown in the code below?

data have;

informat name \$80.;

input name &;

cards;

FIGLIULO,MARGARET A

WHEELER,TANYA

WALDRON,VIVIAN A

LEE,KHALID

LEWIS,G. P

MC NAVARRETE,SAUL

OKOCHA,MICHELLE JR

MC RANKIN,EBONY F

SHIRA,PAUL

SINGH,JASBIR

LEWIS, G. P.

DR. MC DERMIT,JOHN,SR.

MRS. FIGLIULO,MARGARET A

MS WHEELER,TANYA JANE

MS WALDRON,VIVIAN A II

MR LEE,KHALID

MS LEWIS,G. P

MR MC NAVARRETE,SAUL

MRS. OKOCHA,MICHELLE JR

MISS MC RANKIN,EBONY F

DR. SHIRA,PAUL

MR. SINGH,JASBIR

MR. LEWIS, G. P.

DR. MC DERMIT,JOHN,III

;

data want (drop=_;

set have;

_i=1;

_j=0;

name=compress(name,".");

do until (scan(name,_i+_j," ,") eq "");

select (_i);

when (1) do;

last=scan(name,_i+_j," ,");

if last in ("MC") then do;

_j+1;

last=catx(' ',last,scan(name,_i+_j," ,"));

end;

else if last in ("MR","MRS","MS","MISS","DR") then do;

_j+1;

_i=_i-1;

prefix=last;

call missing(last);

end;

end;

when (2) first=scan(name,_i+_j," ,");

otherwise do;

if scan(name,_i+_j," ,") in ("JR","SR","II","III","IV") then

suffix=scan(name,_i+_j," ,");

else middle=scan(name,_i+_j," ,");

end;

end;

_i+1;

end;

run;

Contributor
Posts: 42

## Re: more name string

Thank you very much for your great help!

Super User
Posts: 13,583

## Re: more name string

At least my garbage name data didn't split the MC off of names like MCDERMOT.

Any names like O'DONNEL or O'NEIL?

How about VAN DYKE?

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
• 9 replies
• 410 views
• 0 likes
• 5 in conversation