Quartz | Level 8

## Scan function for separating out name parts

I have a variable that lists a person's entire name. I would like to extract the first, middle, and last names as well names suffixes (e.g., Jr.) and save them into separate variables. I have figured out how to do everything except pull the full middle name if the middle name is more than one word. An example that is giving me trouble:

Johnson, Ethan Marc Anthony, Jr.

Where Ethan is the first name, Marc Anthony is the middle name, Johnson is the last name, and Jr. is the name suffix.

When I use the following, everything is pulled correctly except middle name:

```lastnm = scan(Name,1,",");
firstmid = scan(Name,2,",");
firstnm = scan(firstmid ,1," ");
midnm= scan(firstmid ,2," ");
nmsuffix = scan(Name,3,",");```

When I run this, the result is:

firstnm = Ethan

midnm = Marc (should be "Marc Anthony")

lastnm = Johnson

nmsuffix = Jr.

Thoughts?

1 ACCEPTED SOLUTION

Accepted Solutions
Obsidian | Level 7

## Re: Scan function for separating out name parts

You can use substr function. Just replace midnm with following statement:

``midnm= substr(firstmid,length(firstnm)+2,length(firstmid)-length(firstnm));``
4 REPLIES 4
Super User

## Re: Scan function for separating out name parts

@raivester wrote:

I have a variable that lists a person's entire name. I would like to extract the first, middle, and last names as well names suffixes (e.g., Jr.) and save them into separate variables. I have figured out how to do everything except pull the full middle name if the middle name is more than one word. An example that is giving me trouble:

Johnson, Ethan Marc Anthony, Jr.

Where Ethan is the first name, Marc Anthony is the middle name, Johnson is the last name, and Jr. is the name suffix.

When I use the following, everything is pulled correctly except middle name:

```lastnm = scan(Name,1,",");
firstmid = scan(Name,2,",");
firstnm = scan(firstmid ,1," ");
midnm= scan(firstmid ,2," ");
nmsuffix = scan(Name,3,",");```

When I run this, the result is:

firstnm = Ethan

midnm = Marc (should be "Marc Anthony")

lastnm = Johnson

nmsuffix = Jr.

Thoughts?

Try something like this:

```data example;
name="Johnson, Ethan Marc Anthony, Jr.";
length lastnm \$ 10 firstmid \$ 30 firstnm midnm \$15 ;
lastnm = scan(Name,1,",");
firstmid = strip(scan(Name,2,","));
firstnm = scan(firstmid ,1," ");
midnm= strip(substr(firstmid ,length(firstnm)+2));
nmsuffix = strip(scan(Name,3,","));
run;```

You didn't show an entire data step so I don't know if you set lengths for your variables. It is a good idea to do so.

The Strip is because the way you build Firstmid it will often have a leading blank, as in your example.

The midnm uses the length function to find the length of the first name then start pulling characters at two after that length to start on the remainder. I include another strip there because, having dealt some name fields like this, there will be garbage characters in there.

Obsidian | Level 7

## Re: Scan function for separating out name parts

You can use substr function. Just replace midnm with following statement:

``midnm= substr(firstmid,length(firstnm)+2,length(firstmid)-length(firstnm));``
Quartz | Level 8

## Re: Scan function for separating out name parts

Ooh! Good thinkin'!
Super User

## Re: Scan function for separating out name parts

```data example;
name="Johnson, Ethan Marc Anthony, Jr.";

lastnm = scan(Name,1,",");
firstmid = scan(Name,2,",");
firstnm = scan(firstmid ,1," ");
midnm= substr(firstmid ,findc(left(firstmid),' ')+1);
nmsuffix = scan(Name,3,",");
run;```
Discussion stats
• 4 replies
• 781 views
• 2 likes
• 4 in conversation