Using PROC SQL, is it possible to remove the Middle Initial (if it is present) from the created First_Name variable in the SQL below?
Example:
Smith, John M
SQL below returns
Smith, John M
John M
Smith
Need to remove the middle initial 'M' where present from first_name
Here is what I have so far:
Select
FULL_NAME
,scan(FULL_NAME,-1,',') as first_name
,scan(FULL_NAME,1,',') as last_name
FROM
SAMPLE_TABLE
Thanks in advance.
Your code was close:
data sample_table;
informat full_name $50.;
input full_name &;
cards;
Smith, John M
Smith, Joe
Smith, Mary Margaret
Smit, Margaret M
;
proc sql;
create table want as
Select
FULL_NAME,
scan(FULL_NAME,1,' ,') as last_name,
scan(FULL_NAME,1-countw(full_name),' ,') as first_name
from SAMPLE_TABLE
;
quit;
HTH,
Art, CEO, AnalystFinder.com
It depends on how consistently the names are formatted but you could try a regex. The below is not tested, but the idea to find a space followed by a single character and remove it before using the scan.
scan(prxchange('s/\s\w$//', 1, strip(FULL_NAME)), -1, ',') as first_name
Your code was close:
data sample_table;
informat full_name $50.;
input full_name &;
cards;
Smith, John M
Smith, Joe
Smith, Mary Margaret
Smit, Margaret M
;
proc sql;
create table want as
Select
FULL_NAME,
scan(FULL_NAME,1,' ,') as last_name,
scan(FULL_NAME,1-countw(full_name),' ,') as first_name
from SAMPLE_TABLE
;
quit;
HTH,
Art, CEO, AnalystFinder.com
proc sql ;
create table result as
select scan( source,1,',' ) as surname length=12
, case when length( scan( source, -1, ' ' ) ) > 1
then scan( source, -1, ' ' )
when length( scan( source, -2, ' ' ) ) > 1
then scan( source, -2, ' ' )
else scan( source, -3, ' ' )
end as forename length= 12
from source_data
;
quit ;
Thank you all for the help! That did the trick!
Regular expressions are most flexible
data a;
name = "George W. Bush"; output;
name = "George W Bush"; output;
name = "George Bush"; output;
run;
proc sql;
select name,
prxChange("s/(\w+\s)(\s*\w\.?\s+)(\w+)/\1\3/io",1,name) as shortName
from a;
quit;
Remember: prxChange returns the original string when the pattern is not found.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.