mrahouma
Obsidian | Level 7

I have the following dataset and need help extracting the middle name from a single variable whose values contain different numbers of words.

I saw a prior link, but it does not work for me.

My data has only one variable:

 

Data statisticians;
infile datalines ;
Input name $30. ;
Datalines;
Ronaldo Al Fisher
H. O. Meir
Lee Sara Kim Ivan
Marco Sina
;
run;

 

 

Names with first and last names only should be blank in the new column called middle.

I tried the following code but it does not work:

data statisticians;
  length middle $10;
  set statisticians;
  if count = 2 then middle=.;
  if count = 3 then middle= scan(name,2);
  if count = 4 then middle=scan(name,2);
run;

Any help will be greatly appreciated.

 

1 ACCEPTED SOLUTION

Kurt_Bremser
Super User
data statisticians;
infile datalines;
input name $30.;
length middle $10;
/* take the second word as the middle name only when the name has more than two words */
if countw(name) > 2 then middle = scan(name,2);
datalines;
Ronaldo Al Fisher
H. O. Meir
Lee Sara Kim Ivan
Marco Sina
;
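
A note on why the attempt in the question probably fails: `count` is never assigned, so it is always missing and none of the IF conditions is true. In addition, `middle=.` assigns a numeric missing value to a character variable (a blank character value is written `' '`). Below is a minimal sketch of the same idea with the word count computed explicitly. It also passes a blank as the only delimiter, because with the default delimiters SCAN and COUNTW split on periods as well, so "H. O. Meir" would give "O" rather than "O.". The dataset name `want` and the variable `n_words` are placeholders, not part of the original post.

```sas
/* Sketch: assumes the dataset statisticians with the variable name already exists */
data want;
  set statisticians;
  length middle $10;
  /* count blank-separated words only, so periods stay attached to initials */
  n_words = countw(name, ' ');
  /* names with exactly two words keep a blank middle */
  if n_words > 2 then middle = scan(name, 2, ' ');
run;
```

Whether to keep the period on initials is a matter of preference; dropping the third argument of SCAN and the second argument of COUNTW restores the default behavior.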
Shmuel
Garnet | Level 18

I don't think there is a rule that defines which word in a four-word name is the real middle name.

If you can suggest such a rule, then you can adapt the code you got.
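
As one illustration of such a rule, the sketch below assumes that everything between the first and the last word counts as the middle name, so "Lee Sara Kim Ivan" would give "Sara Kim". This rule is only an assumption for the example, not something stated in the thread, and the names `want`, `n_words` and `i` are placeholders.

```sas
/* Sketch of a hypothetical rule: middle = all words between the first and the last */
data want;
  set statisticians;
  length middle $30;
  n_words = countw(name, ' ');
  /* middle starts blank on every row; the loop only runs for names with 3+ words */
  if n_words > 2 then do i = 2 to n_words - 1;
    /* CATX skips blank arguments, so the first pass just stores the word itself */
    middle = catx(' ', middle, scan(name, i, ' '));
  end;
  drop i n_words;
run;
```

A different rule (for example, only the second word, or only the next-to-last word) needs a change to the loop bounds or a single SCAN call instead of the loop.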

ballardw
Super User


Are you 100 percent sure that your data does not have any last names that are two or more words, such as "Van Dyke" or "De La Cruz", without middle names present?

Or possibly have only a last name (or first)?

Or have added bits like titles (Dr John Doe, Mrs Jane Roe) or indications like "John Smith II" or "John Smith Junior" or "John Smith the Third"?

You may want to flag anything with a word count > 5 for personal attention.
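
A rough sketch of how such rows could be flagged before any middle name is extracted. The lists of titles and suffixes are illustrative assumptions and certainly incomplete, and the names `review`, `n_words`, `first_word`, `last_word` and `needs_review` are placeholders. It does not detect multi-word last names such as "Van Dyke"; those still need a lookup list or a manual check.

```sas
/* Sketch: flag names that are unlikely to fit a simple first / middle / last pattern */
data review;
  set statisticians;
  n_words = countw(name, ' ');
  /* first and last word, upper-cased and with periods removed, for comparison */
  first_word = upcase(compress(scan(name, 1, ' '), '.'));
  last_word  = upcase(compress(scan(name, -1, ' '), '.'));
  /* 1 = needs a manual look, 0 = looks like a plain name */
  needs_review = (n_words < 2 or n_words > 5
                  or first_word in ('DR', 'MR', 'MRS', 'MS')
                  or last_word in ('JR', 'SR', 'II', 'III', 'JUNIOR', 'SENIOR'));
run;
```

Rows with `needs_review = 1` can then be printed or exported for a manual check, for example with a WHERE statement in PROC PRINT.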

 

 
