BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sumitbe139
Fluorite | Level 6

HI all,

 

I am working with a data sets which has names like 

 

Nichols, KregFI

 

its in the form "lastname,(space)firstnamemiddlename"

I accept my o/p to be like this :

firstname=kerg

middlename=Fl

lastname=Nichols

 

the code for that:

 

data b;

str='Nichols, KregFI';   /*** String **/

a=index(str,",")+2;     /** for finding the position of first Name **/

first_name=substr(scan(str,2,","),1,a-1);   /** for first name only**/

run;

 

But ,I am getting o/p :

 

s.no        str                        a           first_name

1Nichols, KregFI10KregFI

 

 

please correct and assist me for that.

 

thanks,

1 ACCEPTED SOLUTION

Accepted Solutions
GinaRepole
SAS Employee

Very similar to what Astounding just said, but I'd add the space character into the scan delimiter listing as well just to confirm that the space doesn't offset our numbers. Here's my thought process to help you understand the code in another way:

 

Last_Name = SCAN(name, 1, ",");
First_Middle = SCAN(name, 2, ", ");
First_Name = SUBSTR(First_Middle, 1, LENGTH(First_Middle)-2);
Middle_Init = SUBSTR(First_Middle, LENGTH(First_Middle)-1, 2);

 

Last name is easiest to grab; you've already got that. Then you can take the cluster of First and Middle initials together. First name is the substring of everything EXCEPT those last two characters, middle initials are ONLY the last two characters. (These assignments should also be inside a data step, to be clear)

View solution in original post

4 REPLIES 4
GinaRepole
SAS Employee
What is the rule defining the split between the first name and middle name?

Is the middle name a two letter abbreviation?
Is the middle name always going to begin with the first capital letter after the first name starts?

Either one is possible to correct, but if the data isn't consistent with either of these styles I'm not sure it's possible (given there is no separating character).
sumitbe139
Fluorite | Level 6
thank you for your response,

the first name & middle name has no split in between.

yes, the middle name is two letter abbreviation.

yes, its start with the first capital letter as the first name ends.


Astounding
PROC Star

It's probably easier to start this way:

 

last_name = scan(str, 1, ',');

first_name = scan(str, 2, ',');

 

At this point, FIRST_NAME still has the initials at the end.  But you should be able to find a way to split them off into a separate variable, such as:

 

middle_name = substr(first_name, length(first_name)-1);

first_name = substr(first_name, 1, length(first_name)-2);

GinaRepole
SAS Employee

Very similar to what Astounding just said, but I'd add the space character into the scan delimiter listing as well just to confirm that the space doesn't offset our numbers. Here's my thought process to help you understand the code in another way:

 

Last_Name = SCAN(name, 1, ",");
First_Middle = SCAN(name, 2, ", ");
First_Name = SUBSTR(First_Middle, 1, LENGTH(First_Middle)-2);
Middle_Init = SUBSTR(First_Middle, LENGTH(First_Middle)-1, 2);

 

Last name is easiest to grab; you've already got that. Then you can take the cluster of First and Middle initials together. First name is the substring of everything EXCEPT those last two characters, middle initials are ONLY the last two characters. (These assignments should also be inside a data step, to be clear)

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2076 views
  • 3 likes
  • 3 in conversation