I have names that are "last name, first name". Some have a middle initial and some have "Jr". The middle initial is always after the first name separated by a space and the "Jr" is always after the last name separated by a space. How can I split this in 4 different columns? fname, lname, mname, cadency
data have;
input FullName $50.;
datalines;
Smith, John
Smith Jr, John
Smith III, John
Smith, Jane A
Smith, John J
;
@Reeza wrote:
Look up SCAN() function. But how will you tell the difference between cadence and middle name? That one may require you to provide a list of possible values as it can be pretty long. FYI - if you have SAS Data Quality Studio I think it does this for you.
Thanks! Scan worked pretty well. This is what I came up with:
data want; set have; fname=scan(FullName, 2, ','); lname=scan(FullName, 1, ',') ; keep FullName fname lname; run; data want2; set want; f_count=countw(fname, ' '); l_count=countw(lname, ' '); first_name=scan(fname, 1, ' '); last_name=scan(lname, 1, ' '); if f_count=2 then middle_name=scan(fname, -1, ' '); else middle_name=' '; if l_count=2 then cadency=scan(lname, -1, ' '); else cadency=' '; keep FullName first_name middle_name last_name cadency; run;
What output do you expect for the input supplied?
@Reeza wrote:
Look up SCAN() function. But how will you tell the difference between cadence and middle name? That one may require you to provide a list of possible values as it can be pretty long. FYI - if you have SAS Data Quality Studio I think it does this for you.
Thanks! Scan worked pretty well. This is what I came up with:
data want; set have; fname=scan(FullName, 2, ','); lname=scan(FullName, 1, ',') ; keep FullName fname lname; run; data want2; set want; f_count=countw(fname, ' '); l_count=countw(lname, ' '); first_name=scan(fname, 1, ' '); last_name=scan(lname, 1, ' '); if f_count=2 then middle_name=scan(fname, -1, ' '); else middle_name=' '; if l_count=2 then cadency=scan(lname, -1, ' '); else cadency=' '; keep FullName first_name middle_name last_name cadency; run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.