Solved
Contributor
Posts: 68

# List all obs in a data set where more than 2 words is in a variable?

Hi guys,

The FINDW function can only search for one word at a time:

FINDW(string, word, chars, modifiers <, startpos>)

Which function can help me find two or more words in the same variable?

Accepted Solutions
Solution
‎09-17-2014 10:16 AM
Super User
Posts: 8,127

## Re: List all obs in a data set where more than 2 words is in a variable?

Just use normal boolean logic.

data want ;

set have ;

if indexw(upcase(comment),'PATIENT')

and indexw(upcase(comment),'ANTIBIOTICS')

;

run;
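The same boolean logic extends to the second case asked about in this thread (patient together with high stress). The following is only a sketch: it assumes the data set is named have, the variable is comment, and that words are separated by blanks only. The flag variables has_patient, has_antibiotics, and has_stress are hypothetical names introduced for illustration.

```sas
/* Assumption: HAVE contains a character variable COMMENT, as in the thread. */
data want;
  set have;
  /* 'i' makes FINDW ignore case; ' ' sets blank as the only word delimiter. */
  has_patient     = findw(comment, 'patient', ' ', 'i') > 0;
  has_antibiotics = findw(comment, 'antibiotics', ' ', 'i') > 0;
  has_stress      = findw(comment, 'high stress', ' ', 'i') > 0;
  /* Keep rows that mention patient together with either of the other terms. */
  if has_patient and (has_antibiotics or has_stress);
run;
```

Because blank is the only delimiter here, a term followed directly by punctuation (for example "antibiotics.") would not match. Adding the punctuation characters to the third argument is one way to handle that.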

All Replies
Posts: 4,741

## Re: List all obs in a data set where more than 2 words is in a variable?

countw() if it's just about finding strings with more than 2 words.

find() if you're looking for a sub-string made up of 2 specific words.

A regular expression using prxmatch() for more complex text patterns.
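A minimal sketch of the three options side by side, using a made-up test string rather than real data. The variable names n_words, pos_find, and pos_prx are hypothetical.

```sas
data _null_;
  str = 'ONE TWO THREE FOUR';
  /* Count the blank-delimited words in the string. */
  n_words  = countw(str, ' ');
  /* Position of the substring 'TWO THREE', or 0 if it is absent. */
  pos_find = find(str, 'TWO THREE');
  /* Position where the pattern (TWO, whitespace, THREE as whole words) matches, or 0. */
  pos_prx  = prxmatch('/\bTWO\s+THREE\b/', str);
  put n_words= pos_find= pos_prx=;
run;
```

A test such as n_words > 2 would then pick out strings with more than 2 words.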

Contributor
Posts: 68

## Re: List all obs in a data set where more than 2 words is in a variable?

Thank you for pointing me to prxmatch(). It does search for patterns, but from what I have just read in SAS(R) 9.3 Functions and CALL Routines: Reference, it does not seem able to search for two separate substrings.

Super User
Posts: 8,127

## Re: List all obs in a data set where more than 2 words is in a variable?

Depends what you mean. If you want to search for a "word" that contains a space, then findw() works just fine.

data have ;

  str='ONE TWO THREE FOUR';

  x=findw(str,'TWO THREE');

  put x=;

run;

x=5

Super User
Posts: 9,599

## Re: List all obs in a data set where more than 2 words is in a variable?

If it's just two words separated by a space, then my suggestion would be index(), which is a tiny fraction faster and needs less code:

data have ;

str='ONE TWO THREE FOUR';

x=index(str,' ');

put x=;

run;

Contributor
Posts: 68

## Re: List all obs in a data set where more than 2 words is in a variable?

Sorry, I didn't make my question clear.

I'm not looking for one substring made up of 2 words, but for 2 or more separate substrings, each made up of 1 or more words.

For example, I want to find the obs where the variable Comment contains both Patient and antibiotics, and also the obs where Comment contains both patient and high stress. Which function should I use?

Thank you so much!

comment

Patient has had a persistent cough for 3 weeks

Patient placed on beta-blockers on 7/1/2006

Patient has been on antibiotics for 10 days

Patient advised to lose some weight

This patient is always under high stress

Refer this patient to mental health for evaluation

Super User
Posts: 9,599

## Re: List all obs in a data set where more than 2 words is in a variable?

Well, you can try index:

if index(comment,"Patient")>0 and index(comment,"antibiotics")>0 then ...

You will find it cumbersome if you have lots of different matches to make though.  You could also use regular expressions to the same effect.
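As a sketch of the regular-expression route, one option is a pattern with two lookaheads, so that both words must be present but may appear in either order. This assumes the data set have and variable comment from the thread; the pattern itself is an illustration, not something posted by the participants.

```sas
/* Assumption: HAVE contains a character variable COMMENT, as in the thread. */
data want;
  set have;
  /* Each (?=...) lookahead is tested from the start of the string,   */
  /* so both words must occur somewhere, in any order. /i ignores case. */
  if prxmatch('/^(?=.*\bpatient\b)(?=.*\bantibiotics\b)/i', comment);
run;
```

Adding another required term means adding another lookahead, which keeps the condition in one place but can become hard to read for long lists.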

You could look at putting your matches in a datastep then generating the code:

data matches;

matcha="Patient"; matchb="antibiotics"; output;

run;

data _null_;

set matches end=last;

if _n_=1 then call execute('data want; set have;');

call execute(' if index(comment,"'||strip(matcha)||'")>0 and index(comment,"'||strip(matchb)||'")>0 then found=1;');

if last then call execute(';run;');

run;

Contributor
Posts: 68

## Re: List all obs in a data set where more than 2 words is in a variable?

Sample rows (the last field in each row is comment):

001 Mayo Clinic 10/21/2006 120 78 7 Patient has had a persistent cough for 3 weeks

003 HMC 09/01/2006 166 58 8 Patient placed on beta-blockers on 7/1/2006

002 Mayo Clinic 10/01/2006 210 68 9 Patient has been on antibiotics for 10 days

004 HMC 11/11/2006 288 88 9 Patient advised to lose some weight

007 Mayo Clinic 05/01/2006 180 54 7 This patient is always under high stress

050 HMC 07/06/2006 199 60 123 Refer this patient to mental health for evaluation

🔒 This topic is solved and locked.