How to divide the whole name column into last name, first name, middle name columns? Thanks!
have:
FIGLIULO,MARGARET A |
WHEELER,TANYA |
WALDRON,VIVIAN A |
want:
LAST NAME | FIRST_NAME | MIDDLE NAME |
FIGLIULO | MARGARET | A |
WHEELER | TANYA | |
WALDRON | VIVIAN | A |
Any exceptions beyond those shown in the code below?
data have;
informat name $80.;
input name &;
cards;
FIGLIULO,MARGARET A
WHEELER,TANYA
WALDRON,VIVIAN A
LEE,KHALID
LEWIS,G. P
MC NAVARRETE,SAUL
OKOCHA,MICHELLE JR
MC RANKIN,EBONY F
SHIRA,PAUL
SINGH,JASBIR
LEWIS, G. P.
DR. MC DERMIT,JOHN,SR.
MRS. FIGLIULO,MARGARET A
MS WHEELER,TANYA JANE
MS WALDRON,VIVIAN A II
MR LEE,KHALID
MS LEWIS,G. P
MR MC NAVARRETE,SAUL
MRS. OKOCHA,MICHELLE JR
MISS MC RANKIN,EBONY F
DR. SHIRA,PAUL
MR. SINGH,JASBIR
MR. LEWIS, G. P.
DR. MC DERMIT,JOHN,III
;
data want (drop=_:);
set have;
_i=1;
_j=0;
name=compress(name,".");
do until (scan(name,_i+_j," ,") eq "");
select (_i);
when (1) do;
last=scan(name,_i+_j," ,");
if last in ("MC") then do;
_j+1;
last=catx(' ',last,scan(name,_i+_j," ,"));
end;
else if last in ("MR","MRS","MS","MISS","DR") then do;
_j+1;
_i=_i-1;
prefix=last;
call missing(last);
end;
end;
when (2) first=scan(name,_i+_j," ,");
otherwise do;
if scan(name,_i+_j," ,") in ("JR","SR","II","III","IV") then
suffix=scan(name,_i+_j," ,");
else middle=scan(name,_i+_j," ,");
end;
end;
_i+1;
end;
run;
One way would be to parse the string using the scan function. e.g.:
data have;
informat name $80.;
input name &;
cards;
FIGLIULO,MARGARET A
WHEELER,TANYA
WALDRON,VIVIAN A
;
data want;
set have;
_n_=1;
do until (scan(name,_n_," ,") eq "");
select (_n_);
when (1) last=scan(name,_n_," ,");
when (2) first=scan(name,_n_," ,");
otherwise middle=scan(name,_n_," ,");
end;
_n_+1;
end;
run;
One more approach using Substr and Index function.
data y1;
input name $ 1-40;
cards;
FIGLIULO,MARGARET Apple
WHEELER,TANYA
WALDRON,VIVIAN A
;
run;
data y2;
set y1;
firstname= substr(name,1,index(name,',')-1);
lastname=substr(name,index(name,',')+1,index(name,' ')-index(name,',')-1);
middlename=substr(name,index(name,' '),index(name,''));
run;
Have you confirmed that all of your data has a comma between names? Do you have titles like Mr Mrs Ms or things like Junior, II, III or IV?
The data is actually more complicate, many types of names, like the ones with MC NAVARRETE, OR MICHELLE JR...
LEE,KHALID |
LEWIS,G. P |
MC NAVARRETE,SAUL |
OKOCHA,MICHELLE JR |
MC RANKIN,EBONY F |
SHIRA,PAUL |
SINGH,JASBIR |
OR LIKE LEWIS,G. P, middle name in the middle, not at last....
Any exceptions beyond those shown in the code below?
data have;
informat name $80.;
input name &;
cards;
FIGLIULO,MARGARET A
WHEELER,TANYA
WALDRON,VIVIAN A
LEE,KHALID
LEWIS,G. P
MC NAVARRETE,SAUL
OKOCHA,MICHELLE JR
MC RANKIN,EBONY F
SHIRA,PAUL
SINGH,JASBIR
LEWIS, G. P.
DR. MC DERMIT,JOHN,SR.
MRS. FIGLIULO,MARGARET A
MS WHEELER,TANYA JANE
MS WALDRON,VIVIAN A II
MR LEE,KHALID
MS LEWIS,G. P
MR MC NAVARRETE,SAUL
MRS. OKOCHA,MICHELLE JR
MISS MC RANKIN,EBONY F
DR. SHIRA,PAUL
MR. SINGH,JASBIR
MR. LEWIS, G. P.
DR. MC DERMIT,JOHN,III
;
data want (drop=_:);
set have;
_i=1;
_j=0;
name=compress(name,".");
do until (scan(name,_i+_j," ,") eq "");
select (_i);
when (1) do;
last=scan(name,_i+_j," ,");
if last in ("MC") then do;
_j+1;
last=catx(' ',last,scan(name,_i+_j," ,"));
end;
else if last in ("MR","MRS","MS","MISS","DR") then do;
_j+1;
_i=_i-1;
prefix=last;
call missing(last);
end;
end;
when (2) first=scan(name,_i+_j," ,");
otherwise do;
if scan(name,_i+_j," ,") in ("JR","SR","II","III","IV") then
suffix=scan(name,_i+_j," ,");
else middle=scan(name,_i+_j," ,");
end;
end;
_i+1;
end;
run;
Thank you very much for your great help!
At least my garbage name data didn't split the MC off of names like MCDERMOT.
Any names like O'DONNEL or O'NEIL?
How about VAN DYKE?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.