1. I need to delete 10 digits from right including @ sign on email account and leave only login name for Account ID column (i.e. mjohn)
2. You may also notice – if the names are repeated, email account is attached in parenthesis with their names. I will also need to remove the email account including the parenthesis from Full Name column if the names have their email account attached (i.e. Tom Lee (tlee2@gmail.com) = Tom Lee)
Thanks,
Roger
Input data:
Full_Name Account_ID
------------ --------------
Mary Johnson mjohn@gmail.com
Mary Jonson (mjohn2@gmail.com) mjohn2@gmail.com
Robin King rking@gmail.com
Tom Lee tlee@gmail.com
Tom Lee (tlee2@gmail.com) tlee2@gmail.com
Need the following output:
Full_Name Account_ID
------------ --------------
Mary Johnson mjohn
Mary Jonson mjohn2
Robin King rking
Tom Lee tlee
Tom Lee tlee2
You could try something like:
data have (drop=_:);
input;
_infile_=transtrn(_infile_,"(","");
call scan(_infile_, 2, _position, _length, '@', 'o');
call scan(substr(_infile_,1,_position-2), -1,_position2, _length2);
Full_Name=strip(substr(_infile_,1,_position2-1));
Account_ID=substr(_infile_,_position2,_length2);
cards;
Mary Johnson mjohn@gmail.com
Mary Jonson (mjohn2@gmail.com) mjohn2@gmail.com
Robin King rking@gmail.com
Tom Lee tlee@gmail.com
Tom Lee (tlee2@gmail.com) tlee2@gmail.com
;
Here is a PRX approach:
data have (drop=_:);
length full_name account_id $40.;
_name=prxparse("/[^@\(\)]+ /");
_id=prxparse("/\w+(?=@)/");
input;
call prxsubstr(_name,_infile_,_s1,_l1);
call prxsubstr(_id,_infile_,_s2,_l2);
full_name=substr(_infile_,_s1,_l1);
account_id=substr(_infile_,_s2,_l2);
cards;
Mary Johnson mjohn@gmail.com
Mary Jonson (mjohn2@gmail.com) mjohn2@gmail.com
Robin King rking@gmail.com
Tom Lee tlee@gmail.com
Tom Lee (tlee2@gmail.com) tlee2@gmail.com
;
proc print;run;
Haikuo
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.