Extracting the word before and after a specific index word

Regular Contributor
Posts: 163

Hi,

I have a string from which I need to extract the word before and the word after a specific index word (the trigger). The string looks like this:

word word word word word WORD_BEFORE trigger WORD_AFTER word word word word

I can extract WORD_AFTER using the following code:

WORD_AFTER = scan(substr(string,index(string,"trigger")),2);

However, I can't seem to get the code right to extract WORD_BEFORE. Any suggestions, please?

Kind regards
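
One way to mirror the SUBSTR/INDEX approach from the question for WORD_BEFORE, as a minimal sketch: take the part of the string that precedes the trigger and scan it from the right with a negative word count. The variable `pos` and the data set name `want` are assumptions made for illustration. Note that INDEX matches substrings, so a word such as "triggered" would also be found.

```sas
data want (drop=pos);
  length word_before word_after $200;
  string = "word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";

  /* Character position of the trigger, or 0 if it is absent */
  pos = index(string, "trigger");

  /* Last word of everything to the left of the trigger */
  if pos > 1 then word_before = scan(substr(string, 1, pos - 1), -1, " ");

  /* Second word of the text that starts at the trigger */
  if pos > 0 then word_after = scan(substr(string, pos), 2, " ");
run;
```

The two guards leave the variables blank when the trigger is missing or is the first thing in the string, instead of passing an invalid length to SUBSTR.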

Accepted Solutions
Solution
‎07-15-2017 08:01 AM
Super User
Posts: 7,670

Re: Extracting the word before and after a specific index word

Note that this assumes the trigger word occurs only once; if it occurs more than once, you will get the words around the last occurrence:

data want;
  length before after $200;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
  do i=1 to countw(string," ");
    if scan(string,i," ")="trigger" then do;
      before=scan(string,i-1," ");
      after=scan(string,i+1," ");
    end;
  end;
run;
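
If the first occurrence is wanted instead of the last, the loop can stop as soon as the trigger is found. The following is only a sketch of that variation: the `found` flag is an assumption added for illustration, and the `i>1` guard avoids calling SCAN with a word count of 0 when the trigger is the first word.

```sas
data want (drop=i found);
  length before after $200;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
  found=0;
  /* UNTIL is checked at the end of each pass, so the loop ends after the first hit */
  do i=1 to countw(string," ") until(found);
    if scan(string,i," ")="trigger" then do;
      found=1;
      if i>1 then before=scan(string,i-1," ");
      after=scan(string,i+1," ");
    end;
  end;
run;
```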

All Replies

PROC Star
Posts: 7,428

Re: Extracting the word before and after a specific index word

Here is one way:

data have;
  informat string $80.;
  input string &;
  cards;
now is the time this must stop
Now hear this and do it quickly
;

data want;
  set have;
  WORD_BEFORE = scan(string,findw(string,"this",' ',"e")-1," ");
  WORD_AFTER = scan(string,findw(string,"this",' ',"e")+1," ");
run;

Art, CEO, AnalystFinder.com

 

Valued Guide
Posts: 947

Re: Extracting the word before and after a specific index word

 

If the trigger value is not present, FINDW returns 0, so SCAN is called with -1 and 1. This code would then generate word_before=last-word-of-string and word_after=first-word-of-string, with no attendant NOTEs in the log.

And if the trigger appears as the first word, then word_before generates a note. Or if the word appears only as the last word, then word_after generates a note.

If these conditions are to be avoided, I'd recommend a minor alteration of @art297's response:

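data want (drop=ix);
  set have;
  /* ix is the word number of the trigger, or 0 if it is not found */
  ix=findw(string,"stop",' ','e');
  if ix>1 then WORD_BEFORE = scan(string,ix-1," ");
  /* count words with the same blank delimiter that FINDW and SCAN use */
  if ix>0 and ix<countw(string," ") then WORD_AFTER = scan(string,ix+1," ");
run;
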
PROC Star
Posts: 277

Re: Extracting the word before and after a specific index word

Another way

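data have;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
  /* Capture the words on either side of the literal word "trigger". */
  /* If "trigger" is not found, PRXCHANGE returns the string unchanged. */
  word_before = prxchange('s/^.*?(\S+)\s+trigger\s+(\S+).*$/$1/', 1, string);
  word_after  = prxchange('s/^.*?(\S+)\s+trigger\s+(\S+).*$/$2/', 1, string);
run;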
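
A regular-expression version can also be written with PRXPARSE, PRXMATCH and PRXPOSN, which makes it easy to leave both variables blank when there is no match. This is a sketch under stated assumptions: the pattern and the variable `rx` are illustrative, and the pattern requires a word on both sides of the trigger, so a trigger at the very start or end of the string will not match.

```sas
data want (drop=rx);
  length word_before word_after $200;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";

  /* Group 1 is the word before "trigger", group 2 the word after it */
  rx = prxparse('/(\S+)\s+trigger\s+(\S+)/');

  /* PRXMATCH returns 0 when the pattern is not found */
  if prxmatch(rx, string) then do;
    word_before = prxposn(rx, 1, string);
    word_after  = prxposn(rx, 2, string);
  end;
run;
```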