Hi,
Could you please help me extract first name of full name?
Data test;
input id fullname $;
1 Michael R. Boyce
2 James G. Brocksmith, Jr.
3 Gerald F. Fitzgerald, Jr.
4 Norman R. Bobins, BS, MBA
5 Peter Pace, USMC, Ret.
6 Ryan, Norman A.
7 Anne Newman Foreman
;
run;
I need to create "first_name" column with middle name and without middle name and "last_name" using the fullname information.
As you can see above, In ID=6, the full name start with the last name unlike other observations. If it is the case, I need to re-order it to be consistent with others.
Like...
Data test; set test;
input ID First_name_with_middle $ Last_name $ First_name_without_middle $
datalines;
1 Michael R. Boyce Michael
2 James G. Brocksmith James
3 Gerald F. Fitzgerald Gerald
4 Norman R. Bobins Norman
5 Peter Pace Peter
6 Norman A. Ryan Norman
7 Anne Newman Foreman Anne
;
run;
That is, the first name with middle name column should contain the middle name abbreviation as well.
And, the last name column only contains last name of the full name.
Please help me!
PRX to the rescue...
Data test;
input id fullname $64.;
datalines;
1 Michael R. Boyce
2 James G. Brocksmith, Jr.
3 Gerald F. Fitzgerald, Jr.
4 Norman R. Bobins, BS, MBA
5 Peter Pace, USMC, Ret.
6 Ryan, Norman A.
7 Anne Newman Foreman
8 Michael R.S. Boyce
9 Ryan, Norman A., BS
;
data want;
length firstName lastName $20 initials $6;
if not prxId1 then prxId1 + prxParse("/^(\w+)\s+(([A-Z]\.)*\s*)([^,]+)/");
if not prxId2 then prxId2 + prxParse("/^(\w+),\s*(\w+)\s+(([A-Z]\.)*)/");
set test;
if prxMatch(prxId1, fullname) then do;
firstName = prxPosn(prxId1, 1, fullname);
initials = prxPosn(prxId1, 2, fullname);
lastName = prxPosn(prxId1, 4, fullname);
end;
else if prxMatch(prxId2, fullname) then do;
firstName = prxPosn(prxId2, 2, fullname);
initials = prxPosn(prxId2, 3, fullname);
lastName = prxPosn(prxId2, 1, fullname);
end;
drop prxId: ;
run;
proc print data=want noobs; run;
first Name lastName initials id fullname Michael Boyce R. 1 Michael R. Boyce James Brocksmith G. 2 James G. Brocksmith, Jr. Gerald Fitzgerald F. 3 Gerald F. Fitzgerald, Jr. Norman Bobins R. 4 Norman R. Bobins, BS, MBA Peter Pace 5 Peter Pace, USMC, Ret. Norman Ryan A. 6 Ryan, Norman A. Anne Newman Foreman 7 Anne Newman Foreman Michael Boyce R.S. 8 Michael R.S. Boyce Norman Ryan A. 9 Ryan, Norman A., BS
I just got confused with variables you want in your output dataset
Sorry. The wanted dataset should look like..
Data test;
infile datalines DSD;
input ID First_name_with_middle $ Last_name $ First_name_without_middle $ ;
datalines;
1, Michael R., Boyce, Michael
2, James G., Brocksmith, James
3, Gerald F., Fitzgerald, Gerald
4, Norman R., Bobins, Norman
5, Peter, Pace, Peter
6, Norman A., Ryan, Norman
7, Anne Newman, Foreman, Anne
;
run;
But, when I ran the code above, First_name_with_middle column shows that some observations do not have a dot after middle name abbreviation. I do not know why... but I need the dot after the middle name abbreviation if there is a middle name abbreviation.
PRX to the rescue...
Data test;
input id fullname $64.;
datalines;
1 Michael R. Boyce
2 James G. Brocksmith, Jr.
3 Gerald F. Fitzgerald, Jr.
4 Norman R. Bobins, BS, MBA
5 Peter Pace, USMC, Ret.
6 Ryan, Norman A.
7 Anne Newman Foreman
8 Michael R.S. Boyce
9 Ryan, Norman A., BS
;
data want;
length firstName lastName $20 initials $6;
if not prxId1 then prxId1 + prxParse("/^(\w+)\s+(([A-Z]\.)*\s*)([^,]+)/");
if not prxId2 then prxId2 + prxParse("/^(\w+),\s*(\w+)\s+(([A-Z]\.)*)/");
set test;
if prxMatch(prxId1, fullname) then do;
firstName = prxPosn(prxId1, 1, fullname);
initials = prxPosn(prxId1, 2, fullname);
lastName = prxPosn(prxId1, 4, fullname);
end;
else if prxMatch(prxId2, fullname) then do;
firstName = prxPosn(prxId2, 2, fullname);
initials = prxPosn(prxId2, 3, fullname);
lastName = prxPosn(prxId2, 1, fullname);
end;
drop prxId: ;
run;
proc print data=want noobs; run;
first Name lastName initials id fullname Michael Boyce R. 1 Michael R. Boyce James Brocksmith G. 2 James G. Brocksmith, Jr. Gerald Fitzgerald F. 3 Gerald F. Fitzgerald, Jr. Norman Bobins R. 4 Norman R. Bobins, BS, MBA Peter Pace 5 Peter Pace, USMC, Ret. Norman Ryan A. 6 Ryan, Norman A. Anne Newman Foreman 7 Anne Newman Foreman Michael Boyce R.S. 8 Michael R.S. Boyce Norman Ryan A. 9 Ryan, Norman A., BS
Just one more quick question.
If I have the fullname.. like...
Lavizzo-Mourey, Evans
B. Evan Bayh, III
Those names are not correctly divided into firstname and lastname using the above code.
Thank you!
The possibilities are endless... You can account for hyphenated names and firstnames easily, but frankly I don't even know what's what in "B. Evan Bayh, III". If such "names" are rare, you might be better sending them to a separate table and treating them by hand...
Data test;
input id fullname $64.;
datalines;
1 Michael R. Boyce
2 James G. Brocksmith, Jr.
3 Gerald F. Fitzgerald, Jr.
4 Norman R. Bobins, BS, MBA
5 Peter Pace, USMC, Ret.
6 Ryan, Norman A.
7 Anne Newman Foreman
8 Michael R.S. Boyce
9 Ryan, Norman A., BS
10 Lavizzo-Mourey, Evans
11 B. Evan Bayh, III
;
data want except(keep=id fullname);
length firstName lastName $20 initials $6;
if not prxId1 then prxId1 + prxParse("/^([\w-]+)\s+(([A-Z]\.)*\s*)([^,]+)/");
if not prxId2 then prxId2 + prxParse("/^([\w-]+),\s*([\w-]+)\s+(([A-Z]\.)*)/");
set test;
if prxMatch(prxId1, fullname) then do;
firstName = prxPosn(prxId1, 1, fullname);
initials = prxPosn(prxId1, 2, fullname);
lastName = prxPosn(prxId1, 4, fullname);
output want;
end;
else if prxMatch(prxId2, fullname) then do;
firstName = prxPosn(prxId2, 2, fullname);
initials = prxPosn(prxId2, 3, fullname);
lastName = prxPosn(prxId2, 1, fullname);
output want;
end;
else output except;
drop prxId: ;
run;
title "Matched names";
proc print data=want noobs; run;
title "Unmatched names";
proc print data=except noobs; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.