mrahouma
Obsidian | Level 7

I have the following dataset and need help extracting the middle name from a single character variable whose values contain varying numbers of words.

I saw a prior link, but it does not work for me.

My data has only one variable:

 

Data statisticians;
infile datalines ;
Input name $30. ;
Datalines;
Ronaldo Al Fisher
H. O. Meir
Lee Sara Kim Ivan
Marco Sina
;
run;

 

 

Names that consist of only a first and last name should have a blank value in the new column, called middle.

I tried the following code but it does not work:

data statisticians;
  length middle $10;
  set statisticians;
  if count = 2 then middle = .;
  if count = 3 then middle = scan(name,2);
  if count = 4 then middle = scan(name,2);
run;

Any help will be greatly appreciated.

 

Kurt_Bremser
Super User
data statisticians;
  infile datalines;
  input name $30.;
  length middle $10;
  /* Only names with more than two words get a middle name: the second word */
  if countw(name) > 2 then middle = scan(name,2);
datalines;
Ronaldo Al Fisher
H. O. Meir
Lee Sara Kim Ivan
Marco Sina
;
run;
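
For readers comparing this with the original attempt: that code tested a variable named count that was never assigned, so it was always missing and none of the IF conditions could be true. It also assigned the numeric missing value (.) to a character variable, where a blank string is the usual choice. A minimal sketch of the same idea applied to the existing dataset with SET is shown here. The explicit blank delimiter is an assumption on my part: the default delimiters of COUNTW and SCAN include the period, so with the defaults "H. O. Meir" yields "O" rather than "O.".

```sas
data statisticians;
  length middle $10;
  set statisticians;
  /* Compute the word count before testing it; blank is the only delimiter (assumption) */
  count = countw(name, ' ');
  if count >= 3 then middle = scan(name, 2, ' ');
  else middle = ' ';   /* character missing value is a blank, not a period */
run;
```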
Shmuel
Garnet | Level 18

I don't think there is a rule that defines which word in a four-word name is the real middle name.

If you can suggest such a rule, then you can adapt the code you got.
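
As one illustration of adapting the code to a rule, here is a sketch under a purely hypothetical rule: every word between the first and the last counts as the middle name. Neither the rule nor the names middle_all, n_words and statisticians_mid come from this thread, and the rule may well be wrong for the real data.

```sas
data statisticians_mid;
  set statisticians;
  length middle_all $30;
  call missing(middle_all);
  n_words = countw(name, ' ');
  /* Hypothetical rule: concatenate words 2 through n_words-1 */
  do i = 2 to n_words - 1;
    middle_all = catx(' ', middle_all, scan(name, i, ' '));
  end;
  drop i;
run;
```

Under this rule "Lee Sara Kim Ivan" would get "Sara Kim", and two-word names stay blank because the loop never runs.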

ballardw
Super User


Are you 100 percent sure that your data does not have any last names that are two or more words, such as "Van Dyke" or "De La Cruz", without middle names present?

Or possibly have only a last name (or first)?

Or have added bits like titles (Dr John Doe, Mrs Jane Roe) or indications like "John Smith II" or "John Smith Junior" or "John Smith the Third"?

You may want to flag anything with a word count > 5 for manual review.
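
A rough sketch of such a check follows. The dataset name name_check, the variables n_words and needs_review, and the lower bound of two words are assumptions added for illustration; the upper threshold of 5 follows the suggestion above and can be adjusted.

```sas
data name_check;
  set statisticians;
  n_words = countw(name, ' ');
  /* Flag single-word names and unusually long names for manual review */
  needs_review = (n_words < 2 or n_words > 5);
run;

proc print data=name_check;
  where needs_review = 1;
run;
```

Titles and suffixes such as "Dr" or "Junior" would not be caught by a word count alone, so this only narrows down what needs a manual look.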

 

 
