BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JoanneR
Calcite | Level 5

Hi everyone,

 

I'm trying to use FINDW to find a list of words in string variables. Since I have a long list of words that were originally taken from Excel, I put them all into a macro variable. So far I have the syntax below, but each time I run it, I get an error message.

 

I have a feeling it has to do with the fact some of the words in my list have spaces, parentheses and dashes, but I'm not sure how to account for that in my code.

 

Any help is greatly appreciated!

data test;
set data;
	if findw(lowcase(var1),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var2),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var3),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var4),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var5),&alldrugs, '|','I S') >0 	
	then drug='Yes' ;
	else drug='No'; 
run;

 

Capture.PNG

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

or this?

 

%let alldrugs=orchids|lillies|roses|daisy;
%let ndrugs=%sysfunc(countw(&alldrugs,|));
data customer;
	input var1 $100.;
	cards;
The house was surrounded with various objects. Some objects being boxes, shoes, and roses.
The house was surrounded with various objects. Some objects being boxes, lillies, and frogs.
The house was surrounded with various objects. Some objects being boxes, oranges and frogs.
;
data a;
	array index_drug $ index_drug1-index_drug4;
	array drug_word $ drug_word1-drug_word&ndrugs;
	set customer;
	cust_no=_n_;
	do i=1 to &ndrugs;
	  drug_word(i)=scan("&alldrugs",i,"|");							
	  index_drug(i)=findw(lowcase(var1),compress(lowcase(drug_word(i))));
	end;
	if sum(of index_drug:)>0 then index=1; else index=0;
	drop var1 i drug_word:;
run;
--
Paige Miller

View solution in original post

19 REPLIES 19
novinosrin
Tourmaline | Level 20
data test;
set data;
	if findw(lowcase(var1),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var2),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var3),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var4),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var5),&alldrugs, '|','I S') >0 /*or	this is the error*/
	then drug='Yes' ;
	else drug='No'; 
run;
or then drug='Yes' ; /*is invalid*/

 

JoanneR
Calcite | Level 5

Sorry, that's a mistake in the example. In the original code there isn't an or at the end. I've corrected the question.

PaigeMiller
Diamond | Level 26

Can you show us a portion of &alldrugs?

 

One correction is that &alldrugs must be in double quotes

 

if findw(lowcase(var1),"&alldrugs", '|','I S') >0 or 

etc.

 

If that's not it, then please paste the entire SAS log, including the code and the error message (not just the error message) as text and not as a screen capture, by clicking on the {i} icon and pasting the text into the window that appears. Do not paste the text into your message any other way.

--
Paige Miller
JoanneR
Calcite | Level 5

Good point! I forgot to add those in. It removes the error message, but I still end up with no observations for drug='Yes' when there should be.

 

In terms of the &alldrugs variable, it was created with the following syntax:

 

proc sql;
select drug_name into: alldrugs separated by ' '
from druglist;
quit;
%put &alldrugs;

 

and gave the drugs output into a table. Below are 4 of the drugs listed from the top.

1-(3-CHLOROPHENYL)PIPERAZINE
1-(5-METHOXY-1H-INDOL-3-YL)-2-PROPANAMINE
10-HYDROXYCARBAZEPINE
10-OH CARBAZEPINE

 

I also should mention, some of the drugs have comma's so they are written as 1,3-dimethoxy. Which is why I have the data as pipe delimited.

 

Thanks for the help!

Tom
Super User Tom
Super User

Are expecting one value of VAR1 to have ALL of those drug names IN THAT EXACT ORDER?

Or did you want to test if the value of VAR1 is in that list of drug names?

Did you mean:

findw("&alldrug",var1,....)

@JoanneR wrote:

Good point! I forgot to add those in. It removes the error message, but I still end up with no observations for drug='Yes' when there should be.

 

In terms of the &alldrugs variable, it was created with the following syntax:

 

proc sql;
select drug_name into: alldrugs separated by ' '
from druglist;
quit;
%put &alldrugs;

 

and gave the drugs output into a table. Below are 4 of the drugs listed from the top.

1-(3-CHLOROPHENYL)PIPERAZINE
1-(5-METHOXY-1H-INDOL-3-YL)-2-PROPANAMINE
10-HYDROXYCARBAZEPINE
10-OH CARBAZEPINE

 

Thanks for the help!




PaigeMiller
Diamond | Level 26

@JoanneR wrote:

 

and gave the drugs output into a table. Below are 4 of the drugs listed from the top.

1-(3-CHLOROPHENYL)PIPERAZINE
1-(5-METHOXY-1H-INDOL-3-YL)-2-PROPANAMINE
10-HYDROXYCARBAZEPINE
10-OH CARBAZEPINE


These drugs are upper case. Your original code tested if lowcase(var1) etc. contains these drugs, and you have a case mismatch, you will never get "Yes" until you fix that.

--
Paige Miller
JoanneR
Calcite | Level 5

I tried changing that to make both lower case using the two following forms, but both don't seem to work.

data test;
set data;
	if findw(lowcase(var1),lowcase("&alldrugs"),' ', 'I S') >0 
then drug= 'Yes'; 
else drug='No'; 
run;

data test;
set data;
	if findw(lowcase(var1),compress(lowcase("&alldrugs")),' ', 'I S') >0 
then drug= 'Yes'; 
else drug='No'; 
run;

Thank you!

PaigeMiller
Diamond | Level 26

As @ballardw said, we need to see your data now — a portion of it where you expect to get a match with one of the drugs in &alldrugs.

--
Paige Miller
ballardw
Super User

@JoanneR wrote:

Hi everyone,

 

I'm trying to use FINDW to find a list of words in string variables. Since I have a long list of words that were originally taken from Excel, I put them all into a macro variable. So far I have the syntax below, but each time I run it, I get an error message.

 

I have a feeling it has to do with the fact some of the words in my list have spaces, parentheses and dashes, but I'm not sure how to account for that in my code.

 

Any help is greatly appreciated!

data test;
set data;
	if findw(lowcase(var1),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var2),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var3),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var4),&alldrugs, '|','I S') >0 or 
	   findw(lowcase(var5),&alldrugs, '|','I S') >0 	
	then drug='Yes' ;
	else drug='No'; 
run;

 

Capture.PNG

 


And your example data containing VAR1, Var2, Var3 etc is where?

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

And you should provide the actual value of that macro variable.

 

I suspect the basic problem is order of the variables. The first argument of FINDW is the value to search in and the second parameter is what to search for. Does that sound like what you are doing?

See this small example:

data example;
   longstring= "this is a string with several words";
   x ="string with";
   y= findw(longstring,x);
   z= findw(x,longstr);
run;

Your FINDW code I think looks more like the Z version above.

 

JoanneR
Calcite | Level 5

Due to the sensitivity of this data I cannot provide further info. Instead, I will provide an example to better explain my question. Lets say I have an Excel spreadsheet with a list of fake flowers

 

Flowers:

1. Orchids

2. Lilies

3. Roses

4. 1,4,5-Daisy

5. Dahlia

6. 3,4-Lilac(2-Lilies)Daisy

7. Common lilac

8. False goats beard

 

Say these flowers were placed into &alldrugs. Now in my data I have an observation with the following report under var1:

"The house was surrounded with various objects. Some objects being boxes, shoes, and roses."

 

I want sas to read through that statement and see if the keywords from &alldrugs are mentioned in that statement. If not then to say "No". 

 

My syntax was originally this, but it takes too long to run and I would like it to be more efficient by using the syntax I provided in the original question.

	drug=1;
	i=1;
	do while ((scan("&alldrugs",i,"|") ne '') and drug>0);		
	  i+1;																/*Number of the drug found on the keyword list -1 */
	  drug_word=scan("&alldrugs",i-1,"|");							
	  indx_drug1=findw(lowcase(var1),compress(lowcase(drug_word)));
	  indx_drug2=findw(lowcase(var2),compress(lowcase(drug_word)));
	  indx_drug3=findw(lowcase(var3),compress(lowcase(drug_word)));
	  indx_drug4=findw(lowcase(var4),compress(lowcase(drug_word)));
	  indx_drug5=findw(lowcase(var5),compress(lowcase(drugd_word)));
	  indx_drug6=findw(lowcase(var6),compress(lowcase(drug_word)));
	  indx_drug7=findw(lowcase(var7),compress(lowcase(drug_word)));

if indx_drug1>0 or indx_drug2>0 or .... indx_drug7>0 then drug=1;
else drug=drug+1;
PaigeMiller
Diamond | Level 26

It's fine if you give us a made up example, because real data is sensitive.

 

This works for me.

 

%let alldrugs=orchids|lillies|roses|daisy;

data a;
	var1="The house was surrounded with various objects. Some objects being boxes, shoes, and roses.";
	i=1;
	do while (scan("&alldrugs",i,"|") ne '');		
	  i+1;																/*Number of the drug found on the keyword list -1 */
	  drug_word=scan("&alldrugs",i-1,"|");							
	  indx_drug1=findw(lowcase(var1),compress(lowcase(drug_word)));
	  output;
	end;
run;
--
Paige Miller
JoanneR
Calcite | Level 5

Do you know if there's a way to run that without having to write everything out? The keylist has about 200 words, and the variables have around 50k observations with a different response on each.

 

Thank you! I really appreciate all the input that's been provided.

PaigeMiller
Diamond | Level 26

Is this what you mean?

 

data a;
	array index_drug $ index_drug1-index_drug4;
	array drug_word $ drug_word1-drug_word4;
	var1="The house was surrounded with various objects. Some objects being boxes, shoes, and roses.";
	do i=1 to countw("&alldrugs",'|');
	  drug_word(i)=scan("&alldrugs",i,"|");							
	  index_drug(i)=findw(lowcase(var1),compress(lowcase(drug_word(i))));
	end;
	if sum(of index_drug:)>0 then index=1; else index=0;
	drop var1 i;
run;

You can drop additional variables from the data set if desired.

--
Paige Miller
PaigeMiller
Diamond | Level 26

or this?

 

%let alldrugs=orchids|lillies|roses|daisy;
%let ndrugs=%sysfunc(countw(&alldrugs,|));
data customer;
	input var1 $100.;
	cards;
The house was surrounded with various objects. Some objects being boxes, shoes, and roses.
The house was surrounded with various objects. Some objects being boxes, lillies, and frogs.
The house was surrounded with various objects. Some objects being boxes, oranges and frogs.
;
data a;
	array index_drug $ index_drug1-index_drug4;
	array drug_word $ drug_word1-drug_word&ndrugs;
	set customer;
	cust_no=_n_;
	do i=1 to &ndrugs;
	  drug_word(i)=scan("&alldrugs",i,"|");							
	  index_drug(i)=findw(lowcase(var1),compress(lowcase(drug_word(i))));
	end;
	if sum(of index_drug:)>0 then index=1; else index=0;
	drop var1 i drug_word:;
run;
--
Paige Miller

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 19 replies
  • 3876 views
  • 2 likes
  • 5 in conversation