****
DATAaddress;
INPUT#1@1line1 $50.
#2@1line2 $50.
#3@1line3 $50.;
DATALINES;
Mr. Jason Simmons
123 Sesame Street
Madison, WI
Dr. Justin Case
78 River Road
Flemington, NJ
Ms. Marilyn Crow
777 Jewell Avenue
Pittsburgh, PA
;
RUN;
****
Using the above code, I need to create a new variable called name that contains only the first and last names separated by a comma. For the life of me I cannot seem to figure this one out. Any and all help would be greatly appreciated!
*By the way, I'm using SAS university edition.
UPCASE() to convert to upper case and for find replace you can use index() to find and tranwrd() to replace.
DATA address;
INPUT #1@1 line1 $50.
#2@1 line2 $50.
#3@1 line3 $50.;
name=catx(',',scan(line1,2),scan(line1,3));
Street=upcase(line2);
if index(street,'STREET') then street=tranwrd(street,'STREET','ST.');
if index(street,'ROAD') then street=tranwrd(street,'ROAD','RD.');
if index(street,'AVENUE') then street=tranwrd(street,'AVENUE','AVE.');
DATALINES;
Mr. Jason Simmons
123 Sesame Street
Madison, WI
Dr. Justin Case
78 River Road
Flemington, NJ
Ms. Marilyn Crow
777 Jewell Avenue
Pittsburgh, PA
;
RUN;
Given from your data, if the names are always like ( prefix first name last name ), then using scan will work.
name=catx(',',scan(line1,2),scan(line1,3));
DATA address;
INPUT #1@1 line1 $50.
#2@1 line2 $50.
#3@1 line3 $50.;
name=catx(',',scan(line1,2),scan(line1,3));
DATALINES;
Mr. Jason Simmons
123 Sesame Street
Madison, WI
Dr. Justin Case
78 River Road
Flemington, NJ
Ms. Marilyn Crow
777 Jewell Avenue
Pittsburgh, PA
;
RUN;
Thanks so much SuryaKiran!
An additional question is how would I go about creating a new variable called street so that it contains the street name in only uppercase letters while also replacing Road/Street/Avenue with Rd./St./Ave.?
UPCASE() to convert to upper case and for find replace you can use index() to find and tranwrd() to replace.
DATA address;
INPUT #1@1 line1 $50.
#2@1 line2 $50.
#3@1 line3 $50.;
name=catx(',',scan(line1,2),scan(line1,3));
Street=upcase(line2);
if index(street,'STREET') then street=tranwrd(street,'STREET','ST.');
if index(street,'ROAD') then street=tranwrd(street,'ROAD','RD.');
if index(street,'AVENUE') then street=tranwrd(street,'AVENUE','AVE.');
DATALINES;
Mr. Jason Simmons
123 Sesame Street
Madison, WI
Dr. Justin Case
78 River Road
Flemington, NJ
Ms. Marilyn Crow
777 Jewell Avenue
Pittsburgh, PA
;
RUN;
Thank you so very much SuryaKiran
Please note that you will want to check some of the results carefully as names, both people and streets are some of the worst data out there in common usage.
You can have last names like "Le Blanc" "von Richtofen". In which case scan for third word in something like
Mr. Pierre Le Blanc
will yield Le as the last name, missing the Blanc part. Language and culture or origin can make this even more complicated.
And street names like "Highway to the Gods" have the same issue. And that is if you actually have the luxury of always having a number preceding the street name.
You might see you can actually get you input data in some form of delimited data file. Hopefully the original collector did not put all of the name information in a single piece of text (proper data entry for names into separate fields for first, last , middle name and form of address (Mr Mrs Ms Dr Senator what have you) has been known for a long time.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.