DATA Step, Macro, Functions and more

What is my macro missing?

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

What is my macro missing?

Hey Guys,

So I've looked through some basic macro info but I still can't get this macro running.  Can anyone see why it doesn't work?

Here's some working code that's not in the form of a macro:

data stackedtest1;

          set stacked_3;

                              if find(URL,'cm=') then do;

                                        x=find(URL, 'cm=')+ lengthc('cm=');

                                        cmOutcome1 = substr(URL,x,300);

                                                  if find(cmOutcome1, '&') then do;

                                                            cmOutcome = scan(cmOutcome1,1,'&');

                                                  end;

                                                  else if find(cmOutcome1, '?') then do;

                                                            cmOutcome = scan(cmOutcome1,1,'?');

                                                  end;

                                                  else if find(cmOutcome1, '#') then do;

                                                            cmOutcome = scan(cmOutcome1,1,'#');

                                                  end;

                                                  else if find(cmOutcome1, "/") then do;

                                                            cmOutcome = scan(cmOutcome1,1,"/");

                                                  end;

                                                  else cmOutcome = cmOutcome1;

                              end;

run;

And here's the macro:

%macro FindParseAfter(VarName, WhatItsLookingFor);

                              %if find(&VarName,&WhatItsLookingFor) %then %do;

                                        length = find(&VarName, &WhatItsLookingFor)+lengthc(&WhatItsLookingFor);

                                        cmOutcome1 = substr(&VarName,length,300);

                                                  %if find(cmOutcome1, '&') %then %do;

                                                            cmOutcome = scan(cmOutcome1,1,'&');

                                                  %end;

                                                  %else %if find(cmOutcome1, '?') %then %do;

                                                            cmOutcome = scan(cmOutcome1,1,'?');

                                                  %end;

                                                  %else %if find(cmOutcome1, '#') %then %do;

                                                            cmOutcome = scan(cmOutcome1,1,'#');

                                                  %end;

                                                  %else %if find(cmOutcome1, '/') %then %do;

                                                            cmOutcome = scan(cmOutcome1,1,'/');

                                                  %end;

                                                  %else cmOutcome = cmOutcome1;

                              %end;

%mend;

I think I need to either run the macro in a data step or add that to the macro.  But I can't get it running in a data step or without....

Here's the error I get:

563  data stacked_1 ;

564      set stacked_3;

565

566      %FindParseAfter(URL, 'cm=');

ERROR: Required operator not found in expression: find(&VarName,&WhatItsLookingFor)

ERROR: The macro FINDPARSEAFTER will stop executing.

567

568

569  run;


Accepted Solutions
Solution
‎06-29-2011 03:05 PM
Frequent Contributor
Posts: 104

What is my macro missing?

I'm not sure why the working data step code needs to be converted to a macro.  Data step execution is within a compiled data step environment, whereas macro execution is within the global SAS session, typically to generate code, not to actually process data.  Although you could, if you try hard enough, to make macros process data without using a data step.

Data step and Macro facility serve different entirely purposes, but can be confusing as one needs to keep track what is a data step functionality (executed at data step run time), and what is macro functionality (executed at macro run time).  One thing helpful to keep in mind is that all macro statements and functions always start with a % sign.  If it hasn't got a % in front of it, it's data step.

The error message above is because "find" - without the % - is not a macro function (it is a data step function).  Thus SAS is complaining that after %if it can't find a valid macro conditional expression. Also, the macro function %find does not exist, so you can't simply use %if %find(s1,s2) .... But SAS always has a work around (gotta love these guys).

The fix for this syntax issue is to use %sysfunc( find( &VarName, &WhatItsLookingFor ) )>0, which is a valid macro conditional.  It lets the macro use the data step function at macro run time.

Just fixing the syntax of this particular expression though is not sufficient for your purposes.

View solution in original post


All Replies
Solution
‎06-29-2011 03:05 PM
Frequent Contributor
Posts: 104

What is my macro missing?

I'm not sure why the working data step code needs to be converted to a macro.  Data step execution is within a compiled data step environment, whereas macro execution is within the global SAS session, typically to generate code, not to actually process data.  Although you could, if you try hard enough, to make macros process data without using a data step.

Data step and Macro facility serve different entirely purposes, but can be confusing as one needs to keep track what is a data step functionality (executed at data step run time), and what is macro functionality (executed at macro run time).  One thing helpful to keep in mind is that all macro statements and functions always start with a % sign.  If it hasn't got a % in front of it, it's data step.

The error message above is because "find" - without the % - is not a macro function (it is a data step function).  Thus SAS is complaining that after %if it can't find a valid macro conditional expression. Also, the macro function %find does not exist, so you can't simply use %if %find(s1,s2) .... But SAS always has a work around (gotta love these guys).

The fix for this syntax issue is to use %sysfunc( find( &VarName, &WhatItsLookingFor ) )>0, which is a valid macro conditional.  It lets the macro use the data step function at macro run time.

Just fixing the syntax of this particular expression though is not sufficient for your purposes.

Contributor
Posts: 23

What is my macro missing?

"I'm not sure why the working data step code needs to be converted to a macro"

If your asking why I'm 'macrotizing' a data step, it's because I have a data processing program that has this set of code written many, many times.  And it's made viewing the program somewhat difficult.  It would clean up my program a lot if I could write a macro to simplify things.

Plus, I'd like to get the macro to the point where it looks something like this:  

FindParse(&VarName, &Where2Start, &NumChar2SkipBeg=lengthc(&Where2Start), &Where2Stop, NumChar2KeepEnd=0)

Then this macro would search a string (usually a URL) for a string (Where2Start), then it would return everything after up to a specified string (Where2Stop), while also allowing to either keep the string found in the beginning or the end.  And the default for where to start is after the string that was found.  And the default for where to cut/stop is just before the STOP string.

For instance, lets say I search Google for TV's.  Then I decide to view the TVs on Sears' website.  The URL when I first land on sears.com would look like:

http://www.sears.com/shc/s/s_10153_12605_Computers+%26+Electronics_Televisions_Flat+Panel?viewItems=...

Sears might then save this URL in a database so they can get information out of it.  One thing of interest would be the section that says "keyword=television".  Because Sears wants to know what keywords drive traffic to their page.  So in order to create a "Keyword" variable in SAS where "television" is one of its data, I would need to parse this out.  I don't know all the possible keywords people use so it's best to just take out what's listed there.

That's what this macro could do for me.

FindParse(URL, 'keyword', '&')

Which would return:  television

All I'd have to write is the above line.  1 line.  That's it.  That's so much cleaner, and it's much faster when writing new code in the long run.

It'd also reduce the total number of lines by SOOOO much.  For instance.  I could parse the URL and create new variables in SAS in 1 data step that's short, concise and simple.

Data Step2;

     set Step1;

Keyword = FindParse(URL, 'keyword', '&')

Sort = FindParse(URL, 'sortOption', '&')

AutoRedirect = FindParse(URL, 'autoRedirect', '&')

SID = FindParse(URL, 'sid', '&')

PSID = FindParse(URL, 'psid', '')

run;

The above set of code is much, much easier to read/understand as well as drastically reduces the number of lines in the overall program.  Other people need to be able to read my program and this makes it 10x's easier to read.

This macro would be very helpful in many different situations for me.

I'm going to try to do the %sysfunc to see if I can get it to work.  Hopefully I can.

I've also been thinking of using SQL code.  I don't know any SQL but I think it might work better.  What do you think?

Contributor
Posts: 23

What is my macro missing?

So to update this, if I created a macro variable outside of this macro, called EndChars, that was a list of all possible characters that signify a stop, then set it as the default for Where2Stop, the above data step would be even more simplified to look like:

Data Step2;

     set Step1;

Keyword = FindParse(URL, 'keyword')

Sort = FindParse(URL, 'sortOption')

AutoRedirect = FindParse(URL, 'autoRedirect')

SID = FindParse(URL, 'sid')

PSID = FindParse(URL, 'psid')

run;

That's even better....

Frequent Contributor
Posts: 104

What is my macro missing?

Absolutely agree with your thinking and approach, it's the only sane way to go. You should be able to wrap that up in a nice macro that emits the proper data step code.

For full flexibility, I think chang_y_chung's approach using regular expressions is probably the best approach, if you're comfortable with regular expressions.  That's what they are designed to do.  Data step substring manipulations for serious text work is very tedious and less powerful.  Don't know much about SQL's ability to manipulate strings, I suspect it's less developed than regular expressions.

There are really good papers on SAS regular expressions, like this one:  www2.sas.com/proceedings/sugi29/265-29.pdf

Good luck, and Happy Coding!

Contributor
Posts: 23

What is my macro missing?

Thanks!  I know a little about regular expressions, as I very lightly used them at my last job.  But I think I can read things like what you posted, and learn enough to get this macro going. 

I'm going to work on it some more and I'll be sure to come back whether I get it working or not.

Thanks again!

Frequent Contributor
Posts: 138

What is my macro missing?

find(&VarName,&WhatItsLookingFor)   is a data function which is being put into macro conditional statement.

Regular Contributor
Posts: 241

Re: What is my macro missing?

Below is one way to do what I think you want to do.

   /* test data */
   data one;
      infile cards firstobs=2;
      input line :$60.;
   cards;
   ----+----1----+----2----+----3----+----4----+
   communities.sas.com/message/100363#100363
   communities.sas.com/message/cm=100363
   communities.sas.com/message/cm=100363#100364
   communities.sas.com/message/cm=?100363
   communities.sas.com/message/cm=1&cm=2&
   ;
   run;

   /* get the first word after 'cm=', delimited by
      a '&', '?', '#', or '/'; otherwise return the input
      line as is.
      notice that it returns:
      - input as is, when no delimiters found (2nd line)
      - a missing, when there is nothing between cm= and
        a delimiter (4th line)
      - the last one that qualifies (5th line)
   */
   %macro extract(from=, at=, dlm=\&\?\#\/);
      (prxchange("s|^.*&at.([^&dlm]*)[&dlm].*$|$1|",1,&from))
   %mend  extract;

   data two;
      set one;
      length cm $60;
      cm = %extract(from=line, at=%str(cm=));
   run;

   /* check */
   proc print data=two;
      var cm;
   run;
   /* on lst
   Obs    cm

    1     communities.sas.com/message/100363#100363
    2     communities.sas.com/message/cm=100363
    3     100363
    4
    5     2
   */

Contributor
Posts: 23

What is my macro missing?

Thank you for this.  There's a lot in here that I've never seen before.  For instance I've never seen prxchange, %str, nor have I used the slashes as you have in your dlm variable.  And now that I see how you've defined dlm, it might make more sense to create a global macro variable:

%let EndChars = \&\?\#\/\ \\;

Then set the dlm default as &EndChars.  I'm not sure how to do it so it includes all the characters you've included, as well as a space and nothing.  Those last two would need to be included as well.

But again, thank you for this.  It was very helpful.

Contributor
Posts: 23

What is my macro missing?

So, I decided to write this version so that I would have something to use now.  Then get used to regular expressions and rewrite the macro using them.  And the strange this is, the macro runs just fine in SAS, no errors.  But it doesn't actually do anything to the data set.  Like, the data set comes out looking exactly the same as it went in.  So there was no data processing performed by SAS in the data step.

Here's the macro:

%let EndChar = '&' '?' '#' '/';

%let SearchString = URL;

%let ParsedDataName = 'Display Parameter Value';

%macro FindParseAfter(Where2Start, VarName=&SearchString, Number2Skip=lengthc(&Where2Start), Where2Stop=&EndChar, Number2Keep=1, OutVarName=&ParsedDataName);

 

                              %if %sysfunc(find(&VarName,&Where2Start)) %then %do;

                                        length = %sysfunc(find(&VarName, &Where2Start)) + &Number2Skip;

 

                                        ParsedBeginning= substr(&VarName,length,360);

                                                  %if %sysfunc(find(ParsedBeginning, &Where2Stop)) %then %do;

                                                            &OutVarName= scan(ParsedBeginning,&Number2Keep,&Where2Stop);

                                                  %end;

                              %end;

%mend;

Here's the SAS output:

508  data stacked_1 ;

509      set stacked_3;

510

511      %FindParseAfter('cm=');

512

513

514  run;

NOTE: There were 59840 observations read from the data set WORK.STACKED_3.

NOTE: The data set WORK.STACKED_1 has 59840 observations and 9 variables.

NOTE: DATA statement used (Total process time):

      real time           5.42 seconds

      cpu time            0.45 seconds

No errors.  No problems.  Except it didn't create the new variables and it didn't do any of the Parsing.  How should I start debugging when I'm not even getting an error?  Maybe I should just start on regular expressions now and forget this code.  But it'd be nice to have something that works.

Super Contributor
Super Contributor
Posts: 3,174

What is my macro missing?

Diagnostic techniques to consider:

1) turn on the macro-generated source code:  OPTIONS SOURCE SOURCE2 MACROGEN SYMBOLGEN MPRINT /* MLOGIC */;

2) add this line at various points in your DATA step to print-out on the SAS log the SAS variable contents as you perform one DATA step pass (change each "nn" value to be unique for identification in the log):

PUTLOG '>DIAG-nn>' / _ALL_;

3) perform testing with a smaller subset of unique-characteristic input (data / observations) for exercising various code path scenarios.

Scott Barry

SBBWorks, Inc.

Super User
Super User
Posts: 6,500

What is my macro missing?

What is your macro missing?

You are not parameterizing the problem correctly.  You should pass in the optional parts of your SAS program to the macro and use them to generate the data step code needed to solve your problem. Because your data step code involves mulitple statements you will need to pass the target variable into the macro. So if we make IN be the parameter with the string to search and KEY be the parameter with the text to search for and OUT be the parameter with the name of the target dataset variable.

Then your original program becomes:

data stackedtest1;

     set stacked_3;

     %findparse(in=URL,key=’cm=’,out=cmOutcome);

run;

In the definition of your macro just replace the references to the original variables and strings with the macro parameter names.  For example the FIND function call now becomes:

   find(&in,&key)

Frequent Contributor
Posts: 75

Re: What is my macro missing?

The problem can also be solved by using perl regular expression.

Please look at the attached program.

Thanks

Dhanasekaran R

Attachment
Contributor
Posts: 23

What is my macro missing?

That is some awesome looking code you wrote there.  Thanks for that!

But I don't understand regular expressions at all. So anything coded in regular expressions will have to be tabled until I understand them better. That's why I'd like to get this running as it is, get everything fine and dandy with this code, then spend the next however many months learning about regex and improving on the working code.

Contributor
Posts: 23

What is my macro missing?

I'm not exactly sure what you mean.... because "find(&in,&key)" looks the same as "find(&VarName,&Where2Start)" to me. Why would I need to rename them?

I think you're saying SAS doesn't like my defaults, which is too bad.  This code would be very useful and very simplified if I could get it to work with these defaults.  Then it's just a matter of assigning some defaults in the beginning.  And if anything changes throughout the code, I can write it as you wrote it (with the "="). But part of the point in all this is to have as clean as code as possible.  And "%FindParseAfter('cm=');" is about as clean as I think I can make it.

Or I guess it could be the "circular-esque" reference "Number2Skip=lengthc(&Where2Start)".... maybe SAS doesn't like defining a default by using a varible defined in that macro.... idk.

But I might have to give in.  Because in the end I need something that can compute.  So this simplified/awesome version of the macro might have to give way to a version that works.

I still haven't figured out what's wrong.  But I've figured out how to debug.  I'm going to start over from the beginning.  And rerun & rebuild until the code breaks.  Then I'll try to figure out a way to fix it.  But likely I'll be back on here.... Smiley Happy

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 16 replies
  • 3399 views
  • 6 likes
  • 7 in conversation