BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
_Rachael_
Calcite | Level 5

Hi,

 

I am writing a SAS macro that is mostly working as I expect it to, however, I am having trouble getting the find() function to work.

 

Currently my macro looks something like this simplified version in which I am trying to find the text strings contained in &ssub. in the variable addr.

%macro addrclass;
%do i = 1 to &max.;

/*Some code that is working and not related to the find function goes here and creates some new macro variables such as &ssub.*/

%put &i. &ssub.;

data class;
set class;
%if %sysfunc(find(addr,'&ssub.'))>0 %then %do;
flag=1;
%end;
%else %do;
flag=0;
%end;
run;

%end;
%mend;

I don't get any errors in my log with this code but it always results in flag=0. Even when I manually change the find function to search for a non macro value, it doesn't seem to evaluate.

E.g. the following code would produce the same results

%sysfunc(find("This string contains a dog","dog"))>0

Any help would be greatly appreciated!!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Beginner misunderstanding of the macro preprocessor. Happens to many.

The macro preprocessor resolves the %sysfunc before the data step is even compiled, so it only "sees" the text addr, not the contents of data step variable addr. The condition will always be false, so only the flag = 0 statement will be created by the macro code and handed to the data step compiler.

You want to work with data, so use data step code:

flag = find(addr,"&ssub.") > 0;

as the results of a logical operation are either 0 (false) or 1 (true).

View solution in original post

8 REPLIES 8
Patrick
Opal | Level 21

Macro level is text based. Remove the quotes around the search strings. 

 

And on a side note: A macro variable in single quotes doesn't resolve so here you've got it doubled wrong.

Kurt_Bremser
Super User

Beginner misunderstanding of the macro preprocessor. Happens to many.

The macro preprocessor resolves the %sysfunc before the data step is even compiled, so it only "sees" the text addr, not the contents of data step variable addr. The condition will always be false, so only the flag = 0 statement will be created by the macro code and handed to the data step compiler.

You want to work with data, so use data step code:

flag = find(addr,"&ssub.") > 0;

as the results of a logical operation are either 0 (false) or 1 (true).

_Rachael_
Calcite | Level 5

Thanks Kurt! That certainly was the problem and got the find() function working for me. 

However, my loop is not iterating as I expected (perhaps another misunderstanding of the macro language on my part).

When I run the code, it seems to only flag the &ssub. values from the final iteration of the do loop. I have removed the line that sets the flag to zero if not found but that doesn't seem to have resolved the issue.

 

*UPDATED* - I realised that my loop was adding blank spaces to the &ssub. values that were shorter than the max length value (which happened to also be the last value in the loop). Using the trim function appears to have resolved it.

 

Thanks.

Kurt_Bremser
Super User

@_Rachael_ wrote:

Thanks Kurt! That certainly was the problem and got the find() function working for me. 

However, my loop is not iterating as I expected (perhaps another misunderstanding of the macro language on my part).

When I run the code, it seems to only flag the &ssub. values from the final iteration of the do loop. I have removed the line that sets the flag to zero if not found but that doesn't seem to have resolved the issue.

 

*UPDATED* - I realised that my loop was adding blank spaces to the &ssub. values that were shorter than the max length value (which happened to also be the last value in the loop). Using the trim function appears to have resolved it.

 

Thanks.


See Maxim 46.

Astounding
PROC Star

Your macro is using far too much macro language.  Macro statements do not process values found in DATA step variables.

 

Combining that with the other suggestions you have received, this would be a working version:

 

%macro addrclass;
%do i = 1 to &max.;

/*Some code that is working and not related to the find function goes here and creates some new macro variables such as &ssub.*/

%put &i. &ssub.;

data class;
  set class;
  if find(addr, "&ssub."))>0 then do;
     flag=1;
  end;
  else flag=0;
run;

%end;
%mend;

Depending on what processing takes place in the early steps that you didn't post, it might be possible to save a list of the &SSUB values and go through 1 final DATA step to set values for FLAG.  Right now, there are multiple DATA steps, one for each value of &SSUB.

_Rachael_
Calcite | Level 5

Thanks Astounding! I certainly had over-cooked this one.

 

I'm interested in your comment that there may be a more efficient way to approach this problem without running multiple data steps.

 

What I actually have is a long dataset with an id variable (hrno), three variables which are search terms (ssub1 ssub2 ssub3) and a fourth variable which is a classifier (nht). In a second dataset, I am searching the field addr for either ssub1 or (ssub2 and ssub3). If I identify that a row in the second dataset satisfies a search term, I want to update the category field with the value of the classifier. 

 

Originally I was coding all the search terms into a datastep manually but decided this would be much better approached using a macro. I haven't had a large amount of experience with macros so this was all I could come up with. I hadn't written the whole macro yet but was just testing to make sure I could get the macro loop to work with one search term before continuing. 

 

As I mentioned in my reply to Kurt, the loop doesn't seem to be iterating as I expected anyway so definitely open to other suggestions. Thanks very much!!

 

So far my code is:

%macro addrclass;

%do i = 1 %to &max.;

proc sql noprint;
select ssub1var, ssub2var, ssub3var, nhtvar
	into :ssub1, :ssub2, :ssub3, :snht
from hrlist
where hrno=&i;
quit;

%put &i. &ssub1. &ssub2. &ssub3. &snht.;

data class;
set class;
	if find(addr,"&ssub1.")>0 then flag=1;
run;

%end;


%mend;
Tom
Super User Tom
Super User

You probably want to see if you can figure out how to load your "classifier" dataset into a hash object instead.

 

You use the macro processor to generate code. So to figure out how to generate your code please show an example of the working code without and macro statements.  Then you can start trying to generate the code using macro code (or some other method of code generation).

 

Sounds like you originally had a program like:

data want;
  set have;
  if find(addr,"X1")  or find(addr,"Y1") or find(addr,"Z1") then category =1;
  else  if find(addr,"X2")  or find(addr,"Y2") or find(addr,"Z2") then category =2;
...
run;

And you created a dataset, let's call it CLASSIFIERS, that looks like:

SSUB1 SSUB2 SSUB3  CLASSIFIER
X1 Y1 Z1 1
X2 Y2 Z2 2
...

So a reasonable goal should be to generate the same code.  Personally I always find this type of "wallpaper" code generation a lot easier using PUT statement instead of macro code. Having to move values from data into macro variables and then adding all the complexities of macro variable references just adds confusion.

 

So for example here is a way to recreate the series of IF/THEN ELSE IF/THEN code by using the CLASSIFIERS dataset.  

filename code temp;
data _null_;
  set classifiers ;
  file code ;
  if _n_ > 1 then put 'else ' @;
  put 'if find(addr,' SSUB1 :$quote.
      ') or find(addr,' SSUB2 :$quote. 
      ') or find(addr,' SSUB3 :$quote. 
      ') then category=' CLASSIFIER ';'
  ;
run;

 Then you can use %INCLUDE statement to include that code into the right place in your data step that reads the data you want to classify.

data want;
  set have;
  %include code;
run;

So no need for macro programming (other than the %INCLUDE statement).

 

Now if your want to generalize this were the names of the datasets or variables change for different uses then you might want to make a macro.  But it will be a simple case of replacing hard coded names with macro variable references.

Astounding
PROC Star

The general idea is this.  

 

The macro loops through &MAX times, each time selecting another item to search for and then running a DATA step.

 

It ought to be possible to reduce this to two steps.  So if &MAX is large this is a large savings, but from this side it's not possible to determine that.

 

The first step in the new design gathers ALL the terms you need to search for from HRLIST.  It's probably not a job for SQL, but rather a job for a single DATA step.

 

Then the second step would be a DATA step that reads in the search terms, and sets up FLAG accordingly.

 

That's the general idea.  It's difficult to say more without knowing more about your data.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 8847 views
  • 2 likes
  • 5 in conversation