BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ammarhm
Lapis Lazuli | Level 10
Hi I have a string from which i need to extract the word before and after a specific index word (trigger). The string looks like this word word word word word WORD_BEFORE trigger WORD_AFTER word word word word I can extract WORD_AFTER using the following code: WORD_AFTER = scan(substr(string,index(string,"trigger")),2); However, i cant seem to get the code right to extract the WORD_BEFORE, any suggestions please? Kind regards
1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Note this assumes only one occurence of the word, otherwise you will get the last one:

data want;
  length before after $200;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
  do i=1 to countw(string," ");
    if scan(string,i," ")="trigger" then do;
      before=scan(string,i-1," ");
      after=scan(string,i+1," ");
    end;
  end;
run;

View solution in original post

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Note this assumes only one occurence of the word, otherwise you will get the last one:

data want;
  length before after $200;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
  do i=1 to countw(string," ");
    if scan(string,i," ")="trigger" then do;
      before=scan(string,i-1," ");
      after=scan(string,i+1," ");
    end;
  end;
run;

DR_Jorg
Fluorite | Level 6

Hi,

This is great! Thank you.

I am wondering how you would do this iteratively. I am trying to parse the word before a trigger word, but sometimes the trigger word appears more than once. I need to be able to pull it each time the trigger word appears. The word before is different every time, but the trigger is the same.

 

something like:

 string="word word word word word WORD_BEFORE_1 trigger WORD_AFTER word word word word WORD_BEFORE_2 trigger WORD_AFTER WORD_BEFORE_3 trigger WORD_AFTER";

 

 

WORD_BEFORE_1=

WORD_BEFORE_2=

WORD_BEFORE_3=

etc.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

It is a good idea to start a new topic, provided test data and required output.  This topic is closed, and over a year old.  I would suggest a simple loop, i=1 to scan(your_string), and check each word within the loop.

art297
Opal | Level 21

Here is one way:

data have;
  informat string $80.;
  input string &;
  cards;
now is the time this must stop
Now hear this and do it quickly
;

data want;
  set have;
  WORD_BEFORE = scan(string,findw(string,"this",' ',"e")-1," ");
  WORD_AFTER = scan(string,findw(string,"this",' ',"e")+1," ");
run;

Art, CEO, AnalystFinder.com

 

mkeintz
PROC Star

 

If the trigger value is not present then this code would generate word_before=last-word-of-string, and word_after=first-word-of-string, with no attendent NOTEs on the log.

 

And if the trigger appears as the first word, then word_before generates a note,  Or if the word appears only as the last word, then word_after generates a note.

 

If these condition are to be avoided, I'd recommend a minor alteration of @art297's response:

 

data want (drop=ix);
  set have;
  ix=findw(string,"stop",' ','e');
  if ix>1              then WORD_BEFORE = scan(string,ix-1," ");
  if ix<countw(string) and ix^=0 then WORD_AFTER = scan(string,ix+1," ");
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
kiranv_
Rhodochrosite | Level 12

Another way

data have;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
word_before=  prxchange('s/(.+)(WORD_BEFORE)(.+)/$2/', -1, string);
word_after=  prxchange('s/(.+)(WORD_AFTER)(.+)/$2/', -1, string);

run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 23065 views
  • 2 likes
  • 6 in conversation