Hello,
I am trying to use the scan function to distinguish between first and last name, which are separated by a longer space. For example:
HOGUE STAUBIN BENOIT G
The following code
BOR1_LASTNAME = trim(scan(BOR1_FULL_NAME_COR,1," "));
Breaks it down by:
HOGUE
When the result I would want is: HOGUE STAUBIN
Note that there's some instances where the last name is composed (two names) and sometimes single, but the first and last name(s) are always separated by a longer space.
Any insights?
Can you post more data as text? I won't open ms office files.
The following regular expression can help solving the problem:
data have;
text1 = "HOGUE STAUBIN BENOIT G";
rx = prxparse('/(\w+ \w*) +.*/');
if prxmatch(rx, trim(text1)) then do;
last_name = prxposn(rx, 1, text1);
end;
run;
Look into the modifiers (m) for the SCAN() function.
@camfarrell25 wrote:
Hello,
I am trying to use the scan function to distinguish between first and last name, which are separated by a longer space. For example:
HOGUE STAUBIN BENOIT G
The following code
BOR1_LASTNAME = trim(scan(BOR1_FULL_NAME_COR,1," "));
Breaks it down by:
HOGUE
When the result I would want is: HOGUE STAUBIN
Note that there's some instances where the last name is composed (two names) and sometimes single, but the first and last name(s) are always separated by a longer space.
Any insights?
I tried the following and no luck..
BOR1_LASTNAME = trim(scan(BOR1_FULL_NAME_COR,1,'09'x));
Is this from a text file? Can you upload a small sample? Pasting as text may remove issues in the actual data unfortunately.
It's not pulled from a text file - see attached. Essentially the logic that would work is to be able to separate the first and last name in two if the space between them exceeds one, and then further breaking it down into first last name, and second last name based on the presence of a single space.
Is there a way?
@camfarrell25 wrote:
It's not pulled from a text file - see attached. Essentially the logic that would work is to be able to separate the first and last name in two if the space between them exceeds one, and then further breaking it down into first last name, and second last name based on the presence of a single space.
Is there a way?
I'm assuming you're actually asking how, rather than if it's possible, otherwise the answer is yes.
Use FINDW to find a string of two spaces, then use SUBSTR at that location.
I also used STRIP to remove any leading spaces from the second portion.
data have;
text1 = "HOGUE STAUBIN BENOIT G";
x=findw(text1, " ");
p1 = substr(text1, 1, x);
p2 = strip(substr(text1, x+1));
run;
Can you post more data as text? I won't open ms office files.
The following regular expression can help solving the problem:
data have;
text1 = "HOGUE STAUBIN BENOIT G";
rx = prxparse('/(\w+ \w*) +.*/');
if prxmatch(rx, trim(text1)) then do;
last_name = prxposn(rx, 1, text1);
end;
run;
Hello,
I'm sorry - find attach text file.
CF
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.