SASPhile
Quartz | Level 8

This is a sample of what the data looks like:

 

data have;
input text $50.;
cards;
sponsor withdrawn
withdrawn by sponsor
decision taken by sponsor to withdraw
sponsor decided to withdraw
;
run;

 

The key words are: decision/decide, sponsor, withdrawn

 

How can I efficiently search for the key words, in any order they appear?


Note: Edited by Reeza for clarity and legibility.

5 REPLIES
Astounding
PROC Star

Search for them one at a time:

 

data want;
set have;
/* each flag is 1 if the keyword is found in TEXT, 0 otherwise */
decide = find(text, 'decide', 'i') > 0;
sponsor = find(text, 'sponsor', 'i') > 0;
withdraw = find(text, 'withdraw', 'i') > 0;
run;

 

This gives you three variables, each either 0 or 1, indicating whether the string was found. The 'i' modifier makes the search ignore differences between upper case and lower case.

 

You could use FINDW instead of FIND, but for your purposes FIND looks like the better choice, because it matches substrings. The 0/1 value for the variable WITHDRAW indicates the presence of any of these strings: withdraw, withdraws, withdrawn. It does not locate "withdrew", however. So create as many flags as you need, perhaps including a separate one for "decision".
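To illustrate the FIND versus FINDW difference described above, here is a minimal sketch against the sample data. The variable names withdraw_sub and withdraw_word are hypothetical, and a blank is assumed to be the only word delimiter.

```sas
data want;
    set have;
    /* FIND matches substrings, so 'withdraw' also hits 'withdrawn' */
    withdraw_sub  = find(text, 'withdraw', 'i') > 0;
    /* FINDW matches whole words only, so 'withdrawn' is not counted */
    withdraw_word = findw(text, 'withdraw', ' ', 'i') > 0;
run;
```

For "sponsor withdrawn" the substring flag is 1 and the whole-word flag is 0, which is why FIND suits this search better.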

 

Reeza
Super User

It seems like you've been asking a few questions related to REGEX recently, so I thought this may be a useful reference:

https://regex101.com/

 

You can use it to build and test your strings. 

stat_sas
Ammonite | Level 13

Hi,

 

The syntax below gives the number of key words appearing in each observation. It can be modified to create a 0/1 flag variable for each keyword.

 

data want(drop=list);
set have;
num_keywords=0;
length list $50;
do list = 'decision', 'decide','sponsor', 'withdrawn';
      if find(trim(text), trim(list),'i') > 0 then num_keywords+1;
end;
run;
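One possible way to make that modification is sketched below. The flag names (has_decide, has_sponsor, has_withdraw), the all_found variable, and the shortened stems are assumptions chosen for illustration; note that a stem such as 'deci' would also match unrelated words like "decimal". Because each keyword is searched for independently, the order in which the words appear in the text does not matter.

```sas
data want(drop=i);
    set have;
    /* hypothetical flag names, one per keyword stem */
    array flag{3} has_decide has_sponsor has_withdraw;
    /* stems are assumptions: 'deci' covers decision/decide/decided */
    array stem{3} $8 _temporary_ ('deci', 'sponsor', 'withdraw');
    do i = 1 to dim(flag);
        /* STRIP removes the padding added by the $8 length */
        flag{i} = find(text, strip(stem{i}), 'i') > 0;
    end;
    /* 1 only when every keyword is present, in any order */
    all_found = (sum(of flag{*}) = dim(flag));
run;
```

With the sample data, the first two rows contain no form of "decide", so all_found is 0 for them and 1 for the last two rows.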

SASPhile
Quartz | Level 8

Do the words have to appear in the text in the order they are declared in the list? What if the order changes?

ChrisNZ
Tourmaline | Level 20

1. What should the output look like?

2. RegEx brings no benefit for such a simple search. Consider using index() or find() as shown above.

  FYI, the match string would be something like:


data HAVE;
input TEXT $50.;
MATCH=prxmatch('m/deci(sion|de)|sponsor|withdrawn/i',TEXT);
cards;
sponsor withdrawn
withdrawn by sponsor
decision taken by sponsor to withdraw
sponsor decided to withdraw
run;
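The pattern above matches when any one of the key words is present. If the goal is instead to require all of them, in any order, one option is a set of lookaheads anchored at the start of the string. This is a sketch only: the all_found variable is hypothetical, and it assumes that "withdraw" should also match "withdrawn".

```sas
data want;
    set have;
    /* each lookahead scans the whole string from position 1,
       so the order of the key words does not matter */
    all_found = prxmatch('/^(?=.*deci(sion|de))(?=.*sponsor)(?=.*withdraw)/i', text) > 0;
run;
```

Separate find() flags remain easier to read and maintain, so this mainly shows that the any-order requirement can be expressed in a single pattern.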
