BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bkq32
Quartz | Level 8

I have a variable that has a first name and last name embedded in it. How can I extract the name into a new variable, assuming it always comes after "Requestor: " and before "Reason:"?

 

data have; 
 length note $200.;
 input note $ ;
cards;
This_is_the_Requestor:_Mary_L._Santiago_Reason:
Additional_text_This_is_the_Requestor:_John_Doe_Reason:_stuff
This_is_the_Requestor:_Tyler_Nguyen_Reason:_text_text
;
run;

data have;
 set have;
 note = tranwrd ( note, "_", " " );
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Sorry, it seems that I had some misunderstanding;

 

try next code:

data have; 
 length note $200.;
 input note $ ;
 note = tranwrd ( note, "_", " " );
cards;
This_is_the_Requestor:_Mary_L._Santiago_Reason:
Additional_text_This_is_the_Requestor:_John_Doe_Reason:_stuff
This_is_the_Requestor:_Tyler_Nguyen_Reason:_text_text
;
run;

data want;
 set have;
     pos1 = findw(note,"Requestor:");
     pos2 = findw(note,"Reason:");
     fullname = strip(substr(note,pos1+10,pos2-pos1-10));
drop pos1 pos2; run;

View solution in original post

4 REPLIES 4
Shmuel
Garnet | Level 18

The delimiter between the two names id ":" then:

first_name = scan(note,1,":");
second_name = scan(note,2,":");
Shmuel
Garnet | Level 18

Sorry, it seems that I had some misunderstanding;

 

try next code:

data have; 
 length note $200.;
 input note $ ;
 note = tranwrd ( note, "_", " " );
cards;
This_is_the_Requestor:_Mary_L._Santiago_Reason:
Additional_text_This_is_the_Requestor:_John_Doe_Reason:_stuff
This_is_the_Requestor:_Tyler_Nguyen_Reason:_text_text
;
run;

data want;
 set have;
     pos1 = findw(note,"Requestor:");
     pos2 = findw(note,"Reason:");
     fullname = strip(substr(note,pos1+10,pos2-pos1-10));
drop pos1 pos2; run;
andreas_lds
Jade | Level 19

As soon as personal names are involved, sooner or later things will get interesting. Without a proper delimiter between first-name and last-name this problem can't be solved, because the number of words forming first-name and last-name is rarely two (one word for each), starting the interesting part: from the second to the next-to-last word you have to decide, whether the word belongs to first-name or last-name. This is pure guesswork and will not lead to usable results.

 

The following step extracts the names and stores in in one variable:

data want;
   set have;
   
   length FullName $ 40 rxName 8;
   retain rxName;
   drop rxName;
   
   if _n_ = 1 then do;
      rxName = prxparse('/requestor:_(.+)_reason/i');
   end;
   
   if prxmatch(rxName, note) then do;
      FullName = prxposn(rxName, 1, note);
      FullName = translate(FullName, ' ', '_');
   end;
run;
Kurt_Bremser
Super User

Try this:

data want;
set have;
length name $30;
note = scan(note,2,':');
call scan(note,-1,pos,le,'_','m');
name = translate(substr(note,2,pos - 2),' ','_');
keep name;
run;