BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ammarhm
Lapis Lazuli | Level 10
Hi I have a string from which i need to extract the word before and after a specific index word (trigger). The string looks like this word word word word word WORD_BEFORE trigger WORD_AFTER word word word word I can extract WORD_AFTER using the following code: WORD_AFTER = scan(substr(string,index(string,"trigger")),2); However, i cant seem to get the code right to extract the WORD_BEFORE, any suggestions please? Kind regards
1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Note this assumes only one occurence of the word, otherwise you will get the last one:

data want;
  length before after $200;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
  do i=1 to countw(string," ");
    if scan(string,i," ")="trigger" then do;
      before=scan(string,i-1," ");
      after=scan(string,i+1," ");
    end;
  end;
run;

View solution in original post

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Note this assumes only one occurence of the word, otherwise you will get the last one:

data want;
  length before after $200;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
  do i=1 to countw(string," ");
    if scan(string,i," ")="trigger" then do;
      before=scan(string,i-1," ");
      after=scan(string,i+1," ");
    end;
  end;
run;

DR_Jorg
Fluorite | Level 6

Hi,

This is great! Thank you.

I am wondering how you would do this iteratively. I am trying to parse the word before a trigger word, but sometimes the trigger word appears more than once. I need to be able to pull it each time the trigger word appears. The word before is different every time, but the trigger is the same.

 

something like:

 string="word word word word word WORD_BEFORE_1 trigger WORD_AFTER word word word word WORD_BEFORE_2 trigger WORD_AFTER WORD_BEFORE_3 trigger WORD_AFTER";

 

 

WORD_BEFORE_1=

WORD_BEFORE_2=

WORD_BEFORE_3=

etc.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

It is a good idea to start a new topic, provided test data and required output.  This topic is closed, and over a year old.  I would suggest a simple loop, i=1 to scan(your_string), and check each word within the loop.

art297
Opal | Level 21

Here is one way:

data have;
  informat string $80.;
  input string &;
  cards;
now is the time this must stop
Now hear this and do it quickly
;

data want;
  set have;
  WORD_BEFORE = scan(string,findw(string,"this",' ',"e")-1," ");
  WORD_AFTER = scan(string,findw(string,"this",' ',"e")+1," ");
run;

Art, CEO, AnalystFinder.com

 

mkeintz
PROC Star

 

If the trigger value is not present then this code would generate word_before=last-word-of-string, and word_after=first-word-of-string, with no attendent NOTEs on the log.

 

And if the trigger appears as the first word, then word_before generates a note,  Or if the word appears only as the last word, then word_after generates a note.

 

If these condition are to be avoided, I'd recommend a minor alteration of @art297's response:

 

data want (drop=ix);
  set have;
  ix=findw(string,"stop",' ','e');
  if ix>1              then WORD_BEFORE = scan(string,ix-1," ");
  if ix<countw(string) and ix^=0 then WORD_AFTER = scan(string,ix+1," ");
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
kiranv_
Rhodochrosite | Level 12

Another way

data have;
  string="word word word word word WORD_BEFORE trigger WORD_AFTER word word word word";
word_before=  prxchange('s/(.+)(WORD_BEFORE)(.+)/$2/', -1, string);
word_after=  prxchange('s/(.+)(WORD_AFTER)(.+)/$2/', -1, string);

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 20786 views
  • 2 likes
  • 6 in conversation