DATA Step, Macro, Functions and more

Searching if a phrase exists between two variables

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Searching if a phrase exists between two variables

Hi ,

 

I am working on a dataset where I am supposed to compare two  variables to check if a phrase exists in one of the variables or not. I tried using index function and find function .Both the functions work if there a one-to-one match , I want SAS to tell even if there is that particular phrase in the variable .Ie , I want to check from my below example , variable A phrase is present in Variable b :  Below is the example of what I want :

 

abINDEX Function ResultExpected Result 
Left ICALeft ICA1Match
Left Middle cere - M1Left Middle cere - M11Match
Right VerteRight Verte,Right Verte, mid cervi0Match
Right Middle cere - M1,Right Supra ICARight Middle cere - M10Match
Midline Basilar trunkMidline Basilar trunk,Right Verte0Match
Right VerteLeft Verte0No Matching Data
Left Middle cere - M1Left Verte0No Matching Data

 

Is there an other SAS function or method can I use?

 

I hope I phrased my question correctly,

 

Thanks In Advance !!!

 


Accepted Solutions
Solution
‎09-15-2017 01:45 PM
Super Contributor
Posts: 440

Re: Searching if a phrase exists between two variables

i think this gives you the expected output - it searches both strings for the presence of the other

 

data have;
	length str1 $100 str2 $100;
	infile datalines dlm="~";
	input str1 $ str2 $;
	datalines;
Left ICA~Left ICA
Left Middle cere - M1~Left Middle cere - M1
Right Verte~Right Verte,Right Verte, mid cervi
Right Middle cere - M1,Right Supra ICA~Right Middle cere - M1
Midline Basilar trunk~Midline Basilar trunk,Right Verte
Right Verte~Left Verte
Left Middle cere - M1~Left Verte
;
run;

data want;
	set have;
	if find(strip(str2),strip(str1)) or find(strip(str1),strip(str2)) then flag=1;
	else flag=0;
run;

View solution in original post


All Replies
PROC Star
Posts: 283

Re: Searching if a phrase exists between two variables

/*untested*/

Try eqt operator

 

proc sql;

create table want as

select *

from have

where a eqt b;

quit;

New Contributor
Posts: 3

Re: Searching if a phrase exists between two variables

Posted in reply to novinosrin
Thanks for the reply ! , But it does not work. Smiley Sad, Yet I am glad to learn that there is an 'EQT' operator in PROC SQL
Super User
Posts: 19,815

Re: Searching if a phrase exists between two variables

Can you show your code?

 

Did you use any of the modifiers to ignore case? 

 

And I'm assuming you're using FINDW and INDEXW even though you say FIND/INDEX. 

Solution
‎09-15-2017 01:45 PM
Super Contributor
Posts: 440

Re: Searching if a phrase exists between two variables

i think this gives you the expected output - it searches both strings for the presence of the other

 

data have;
	length str1 $100 str2 $100;
	infile datalines dlm="~";
	input str1 $ str2 $;
	datalines;
Left ICA~Left ICA
Left Middle cere - M1~Left Middle cere - M1
Right Verte~Right Verte,Right Verte, mid cervi
Right Middle cere - M1,Right Supra ICA~Right Middle cere - M1
Midline Basilar trunk~Midline Basilar trunk,Right Verte
Right Verte~Left Verte
Left Middle cere - M1~Left Verte
;
run;

data want;
	set have;
	if find(strip(str2),strip(str1)) or find(strip(str1),strip(str2)) then flag=1;
	else flag=0;
run;
Super User
Posts: 5,509

Re: Searching if a phrase exists between two variables

Posted in reply to ChrisBrooks

Good idea.  One more thing to consider:  if upper vs. lower case should still be a match, and the 'i' modifier:

 

	if find(strip(str2),strip(str1), 'i') or find(strip(str1),strip(str2), 'i') then flag=1;
Frequent Contributor
Posts: 100

Re: Searching if a phrase exists between two variables

Try this:

data WANT;
  set HAVE;

MATCH=0;

NEW_A=compbl(tranwrd(tranwrd(A,",","-"),"-"," "));
NEW_B=compbl(tranwrd(tranwrd(B,",","-"),"-"," "));

if NEW_A=NEW_B then MATCH=1;

if MATCH=0 then do;
  COUNT_WORDS_NEW_A=countw(right(NEW_A),"");
  COUNT_WORDS_NEW_B=countw(right(NEW_B),"");

if COUNT_WORDS_NEW_A<=COUNT_WORDS_NEW_B then
  MATCH = NEW_A=substr(NEW_B,1,length(NEW_A);
else if COUNT_WORDS_NEW_B < COUNT_WORDS_NEW_A then
  MATCH = NEW_B=substr(NEW_A,1,length(NEW_B);
end;
run;

*Do google the SAS functions i used such as TRANWRD, COMPBL, COUNTW, SUBSTR, LENGTH.;

Hope it helps.

New Contributor
Posts: 3

Re: Searching if a phrase exists between two variables

Thanks All !!!  This was very helpful !!!!!!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 173 views
  • 0 likes
  • 6 in conversation