BookmarkSubscribeRSS Feed
R_Auger
Fluorite | Level 6

Hi

 

I am trying to find all instances when two words are present in a text field.

In MS SQL  I can use CONTAINS(t1.NOTES, 'NEAR((Apples, Red),3)'), meaning the words Apples and Red are located near each other.

When I try this in SAS EG 71. I get ERROR: Function CONTAINS could not be located.

I also tried prxmatch("!(Apples|Red)!i", t1.NOTES)> 0, however either word shows up in the results.

So if we have 
1 - Apple Granny Green

2 - Apple Spartan Red

3 - Apple Jazz Red

4 - Apple Yellow Delicious

#2 and #3 are valid

 

I am doing this in Proc SQL.

 

Thanks 

 

2 REPLIES 2
AMSAS
SAS Super FREQ

Here's one way to do it: 
First data step sets up your test data
Then 2 examples using data step and PROC SQL

 

data got ;
	infile cards ;
	input string $30. ;
cards;
Apple Granny Green
Apple Spartan Red
Apple Jazz Red
Apple Yellow Delicious
Oranges and Pears
;

data want ;
	set got ;
	x=prxmatch('/Red/',string) ;
	y=prxmatch('/Apple/',string) ;
	if x and y then 
		put "Red Apple - " string ;
	else
	if x then
		put "Red Only - " string ;
	else
	if y then
		put "Apple Only - " string ;
	else 
		put "None - " string ;

run ;

proc sql ;
select *
from got
where prxmatch('/Red/',string) and prxmatch('/Apple/',string)
;
quit ;
FreelanceReinh
Jade | Level 19

Hi @R_Auger,

 

The 'e' modifier of the FINDW function in conjunction with absolute differences of word positions could be useful to implement something similar to the proximity term NEAR in MS SQL.

 

Example:

data have;
input n text $50.;
cards;
1 Apple Granny Green
2 Apple Spartan Red
3 Apple Jazz Red
4 Apple Yellow Delicious
5 apple cool Jazz Juicy Dark Red Fruit
6 Apple cool Juicy Dark red Fruit
7 big red super fresh apple
8 red really red super fresh apple
;

proc sql;
create table want(drop=_:) as
select *, findw(text,'apple', ' ', 'ei') as _p1,
          findw(text,'red',   ' ', 'ei') as _p2
from have
having _p1 & _p2 & abs(_p1-_p2)<5;
quit;

Result:

n    text

2    Apple Spartan Red
3    Apple Jazz Red
6    Apple cool Juicy Dark red Fruit
7    big red super fresh apple

Item 8 shows the limitation of FINDW, though: It finds only the first occurrence of a word so that duplicates could be a problem. Here, _p1=6 and _p2=1 violate the proximity condition.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 972 views
  • 0 likes
  • 3 in conversation