SAS Programming

DATA Step, Macro, Functions and more
GGO
Obsidian | Level 7
I understood Quentin's explanation. It does not explain the inconsistencies. In Macro, a null string is a valid argument. %SYSFUNC knows that for some functions.

I think the common failure/bug may be for single-argument functions. %SYSFUNC simply does not acknowledge the open-paren and close-paren as delimiters - only the comma that separates more than one argument. So it could be that this bug only affects single-argument DATA STEP functions.

Just a guess, considering all the insights provided here.
GGO
Obsidian | Level 7

This makes the cost of backward compatibility more explicit.

 

Why in the world should SAS double-down for decades on different internal representation for real numbers (including integers) and separately for integers? At least that's what this looks like to me:

 

Fixed thanks to FreelanceReinhard, above.

 

 

data _null_;
  sum = 0;
  one = 1;
  do i = 1 to 11;
    sum = sum + 1/10;
    if sum = one then
       put 'got to ' one '(' one hex16. '), '
            sum= '(hex16: ' sum hex16. ')';
    else put 'still not ' one '(hex16: ' one hex16. '), '
             sum= '(hex16: ' sum hex16. ')'
             ;
  end;
run;

still not 1 (hex16: 3FF0000000000000), sum=0.1 (hex16: 3FB999999999999A)
still not 1 (hex16: 3FF0000000000000), sum=0.2 (hex16: 3FC999999999999A)
still not 1 (hex16: 3FF0000000000000), sum=0.3 (hex16: 3FD3333333333334)
still not 1 (hex16: 3FF0000000000000), sum=0.4 (hex16: 3FD999999999999A)
still not 1 (hex16: 3FF0000000000000), sum=0.5 (hex16: 3FE0000000000000)
still not 1 (hex16: 3FF0000000000000), sum=0.6 (hex16: 3FE3333333333333)
still not 1 (hex16: 3FF0000000000000), sum=0.7 (hex16: 3FE6666666666666)
still not 1 (hex16: 3FF0000000000000), sum=0.8 (hex16: 3FE9999999999999)
still not 1 (hex16: 3FF0000000000000), sum=0.9 (hex16: 3FECCCCCCCCCCCCC)
still not 1 (hex16: 3FF0000000000000), sum=1 (hex16: 3FEFFFFFFFFFFFFF)   <=== eeerrp !!
still not 1 (hex16: 3FF0000000000000), sum=1.1 (hex16: 3FF1999999999999)

 

Excuses abound. None seem satisfying.
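For comparison, the usual way around the exact-equality test is to compare after rounding. The following is a minimal sketch, not something posted in this thread: the rounding unit 1e-9 and the variable names exact_match and rounded_match are arbitrary choices made here for illustration.

```sas
data _null_;
  sum = 0;
  do i = 1 to 10;
    sum = sum + 1/10;                      /* same accumulation as in the log above */
  end;
  exact_match   = (sum = 1);               /* exact comparison of the stored doubles */
  rounded_match = (round(sum, 1e-9) = 1);  /* compare after rounding (unit is an assumption) */
  put exact_match= rounded_match= sum= hex16.;
run;
```

The rounding unit has to be chosen to suit the magnitude of the data, so this is a workaround rather than a fix for the underlying representation.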

Tom
Super User

@GGO wrote:
I understood Quentin's explanation. It does not explain the inconsistencies. In Macro, a null string is a valid argument. %SYSFUNC knows that for some functions.

I think the common failure/bug may be for single-argument functions. %SYSFUNC simply does not acknowledge the open-paren and close-paren as delimiters - only the comma that separates more than one argument. So it could be that this bug only affects single-argument DATA STEP functions.

Just a guess, considering all the insights provided here.

But you are the one who is asking for inconsistency. You want SAS to treat this function call

%sysfunc(trimn())

differently just because you used macro code to generate the function call.

 

As @Quentin explained, give %sysfunc() something as an argument to the function you are asking it to call.

1972  %let x=%str();
1973  %put |%sysfunc(trimn(&x))|;
||

Then it doesn't cause an error. The error occurs only when you call a function, TRIMN(), that requires at least one argument without passing it anything.

 

GGO
Obsidian | Level 7

I can't get my head around a NULL STRING not being a valid argument.

For COMPRESS() and PRXMATCH() two NULL STRINGS are valid arguments.

Why isn't one NULL STRING also a valid argument?
That to me would be more consistent.

 

Perhaps the inconsistency that bothers me is that %SYSFUNC() calls to DATA STEP functions recognize a comma (,) as delimiting two NULL STRINGS, yet single-argument calls fail to recognize the open/close parens as delimiting a single NULL STRING.

 

That for me is the fundamental inconsistency, for which I see no explanation:

  • ONE null argument - ERROR!
  • TWO or more null strings - BACK IN BUSINESS!

That for me is fundamentally inconsistent. But at least it gives me a rule of thumb for recognizing situations that require %length() checks or similar. That's what I was after.
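A %length() check of that kind might look like the sketch below. The macro name strip_guarded is hypothetical, and the guard simply skips the %SYSFUNC call when the parameter is empty; %superq is used so that %length sees the stored value without resolving it any further.

```sas
%macro strip_guarded(str=);
  /* Hypothetical helper: only call STRIP() when there is something to pass */
  %if %length(%superq(str)) = 0 %then %do;
    %put >><<;                          /* empty value: bypass %sysfunc entirely */
  %end;
  %else %do;
    %put >>%sysfunc(strip(&str))<<;     /* non-empty value: one argument is present */
  %end;
%mend strip_guarded;

%strip_guarded()
%strip_guarded(str=abc)
```

This sketch does not handle values that contain commas, which %SYSFUNC would read as extra arguments.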

Tom
Super User

I have no idea how SAS has implemented %SYSFUNC(), but here is a model that makes some sense of the behavior you are seeing. %SYSFUNC() has a list somewhere of each function it can call and how many required arguments the function has. When it checks the syntax it receives, it compares the number of values it finds inside its () to that number. Note this is probably done AFTER the previous rounds of macro processor scanning have replaced all of the macro variable and macro function references.

 

TRIMN() requires at least one argument, as you can see if you try this code in a data step.

x = trimn();

So when you call it without any values you get an error because the number of values does not match. Again, it does not matter whether you do it by just typing this

%let x=%sysfunc(trimn());

Or get there via a more complicated route, like:

%let y=;
%let x=%sysfunc(trimn(&y));

you still get the value count error. At this point it doesn't really matter whether the value being passed is an empty string or not, because no value is being passed at all, and that is all that is being checked. When you pass it a macro-quoted empty string, using %str() perhaps, %SYSFUNC() sees that macro-quoted string as a value to pass when it counts values. Again, it doesn't really care whether the number of characters in the value is zero or not; that is for the data step function to deal with. It just cares that there are the right number of values for the number of required arguments.

 

When you call COMPRESS with just a comma between the parentheses, %SYSFUNC() has something it can test to see whether the count of values provided matches the count of arguments needed. Since a comma separates values, the presence of N commas means there are N+1 values provided.

Quentin
Super User

As background, I found this thread I started three years ago, where I was complaining that %SYSFUNC won't pass a null argument to a function when the argument list is empty.

https://communities.sas.com/t5/SAS-Programming/SYSFUNC-and-null-arguments/m-p/412077#M100760

So I'm definitely not an apologist for SAS; I wish %SYSFUNC behaved like you want it to.

But, I do think it's consistent.

 

In the DATA step, when a function accepts arguments, if you do not pass an argument, it errors:

17   data _null_ ;
18     x=compress() ;
         --------
         71
ERROR 71-185: The COMPRESS function call does not have enough arguments.

19   run ;

Clearly the compiler sees the above as passing zero arguments, rather than passing a single null argument.

 

But if you pass a comma, it does not error:

20   data _null_ ;
21     x=compress(,) ;
22   run ;

And the above is interpreted (I believe) as passing two null arguments to compress.

 

 

We can see the same with tranwrd passing no arguments:

76   data _null_ ;
77     x=tranwrd() ;
         -------
         71
ERROR 71-185: The TRANWRD function call does not have enough arguments.

78   run ;

 

And tranwrd passing three arguments that are null:

79   data _null_ ;
80     x=tranwrd(,,) ;
                 -
                 159
ERROR 159-185: Null parameters for TRANWRD are invalid.

81   run ;

Note that the above shows the DATA step compiler actually throws a different error message for "not enough arguments" than it does for "null parameters".

 

In my testing, the DATA step seems consistent. If a function accepts arguments:

  • An empty argument list will error rather than pass a null value as the argument.
  • If you use a comma in the list, it will pass null value(s) as the argument(s).

 

When you call a DATA step function via %SYSFUNC, the behavior is consistent.

184  %put %sysfunc(compress()) ;
ERROR: The function COMPRESS referenced by the %SYSFUNC or %QSYSFUNC macro function has too few
       arguments.
185  %put %sysfunc(compress(,)) ;
186  %put %sysfunc(compress(,,)) ;
187  %put %sysfunc(compress(,,,)) ;
ERROR: The function COMPRESS referenced by the %SYSFUNC or %QSYSFUNC macro function has too many
       arguments.
188  %put %sysfunc(tranwrd()) ;
ERROR: The function TRANWRD referenced by the %SYSFUNC or %QSYSFUNC macro function has too few
       arguments.
189  %put %sysfunc(tranwrd(,)) ;
ERROR: The function TRANWRD referenced by the %SYSFUNC or %QSYSFUNC macro function has too few
       arguments.
190  %put %sysfunc(tranwrd(,,)) ;
191  %put %sysfunc(tranwrd(,,,)) ;
ERROR: The function TRANWRD referenced by the %SYSFUNC or %QSYSFUNC macro function has too many
       arguments.

 

I disagree with your statement "My original test calls all have the right number of arguments. Some arguments are simply NULL STRINGS."

 

If you code:

%macro test_null_string_handling(str=);
  %put >>%sysfunc(STRIP(&str))<< ;
%mend test_null_string_handling;
%test_null_string_handling()

It is true that the local macro variable STR will have a null value, but this does not mean that you pass a null value to STRIP().  The macro processor will resolve &STR to null, and STRIP will have an empty argument list, which is not allowed.  Just the same as if you coded:

%macro test_null_string_handling(str=);
  %put >>%sysfunc(STRIP())<< ;
%mend test_null_string_handling;
%test_null_string_handling()

 

 

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our email list.
GGO
Obsidian | Level 7

I have understood and do see your points, Quentin. And I very much appreciate your perspective and insights.

 

But in several details, we will simply disagree.

 

1 - DATA STEP and %SYSFUNC macro (text-generation) world are different, and have their own sets of rules.

  • In DATA STEP, a blank CHAR has length 1.
  • In macro world, a blank string has %length 0. This fundamental disconnect definitely contributes to SAS' inconsistencies.
  • %SYSFUNC() should bring DATA STEP functions into the macro (text-gen) world. To me, this means that %SYSFUNC() should translate the DATA STEP blank-char concept into a zero-length NULL STRING.
  • This has been and remains my main point: %SYSFUNC() fails to do this most fundamental translation into the macro (text-gen) world consistently - OK for multiple arguments; ERROR for single arguments. I see no explanation for this (details follow).

2 - %SYSFUNC() apparently demands a comma to delimit arguments. It does not recognize the open/close parens as delimiting a single argument. This to me is a design failure, and I struggle to alter that assessment.

  • See this documentation: "All arguments in SAS language functions within %SYSFUNC must be separated by commas."
  • That's the fundamental design flaw. How in the world are you supposed to separate a single argument with a comma?! Not possible - so design oversight.
  • The open/close parens of the function call should be sufficient to delimit a single argument, including the NULL argument, which otherwise works fine for 2+ NULL arguments.
  • %SYSFUNC()'s focus on the comma has blinded it to the open/close delimiters.
  • NULL arguments are fine in the macro (text-gen) world. %SYSFUNC() simply does not make the translation from DATA STEP to Macro world, and I think this stems from the open/close-parens-as-delimiters issue.
  • Yes, I expect that we will disagree on this point 🙂
  • ( I have no issue with TRANWRD() and TRANSTRN() - function documentation clearly states when blank (null) arguments are valid - or not. )

3 - By comparison, I do consider STRIP() flawed and inconsistent with the macro world.

  • Taking your example, I see no difference between a NULL STRING (str=), and an EXPLICIT NULL STRING (str=%str()).
  • Both arguments are NULL STRINGS - strings with length zero.
  • The EXPLICIT NULL STRING simply helps %SYSFUNC() get over its fundamental failure to recognize a NULL STRING simply based on the open/close-paren delimiters:
%macro test_null_string_handling(str=);
  %put >>%sysfunc(STRIP(&str))<< ;
%mend test_null_string_handling;

%test_null_string_handling
%*;
%test_null_string_handling(str=%str())      /* <== NULL STRING ASSIST, to overcome %SYSFUNC() design flaw */
%*;

With the assist, despite any macro-quoting tokens/symbols, the DATA STEP function STRIP() nonetheless receives a NULL STRING.

 

The results - the assist of %str() eases %SYSFUNC() over its design flaw:

 

323  %test_null_string_handling
324  %*;
ERROR: The function STRIP referenced by the %SYSFUNC or %QSYSFUNC macro function has too few arguments.
>><<
325  %test_null_string_handling(str=%str())     /* <== NULL STRING ASSIST, to overcome %SYSFUNC() design flaw */
>><<
326  %*;

4 - %SYSFUNC() simply cannot recognize otherwise valid single-null-string-arguments.

 

  • That's a %SYSFUNC() design flaw.
  • Yes, I expect that we will disagree on this point, as well 🙂

This has been a very helpful, and at times very frustrating thread for me. I thank all who have contributed. Such frustration, for me, is a mark of insight and learning. This is what you have all given to me.

 

Thank you!

Tom
Super User

Another point to consider is that a macro processor is not a full-blown language. It just processes the macro triggers and passes the resulting text downstream to the next step in parsing.

 

For the macro processor to treat the way it handles evaluation of &y differently just because you happen to want to pass the result to %sysfunc(trimn()) does not seem logical to me.

 

That said, SAS could have decided to implement something like treating %sysfunc(trimn()) the same as %sysfunc(trimn(%str())), but that would cause trouble for functions that take no arguments.

2664  %put %sysfunc(date(%str()));
ERROR: The function DATE referenced by the %SYSFUNC or %QSYSFUNC macro function has too many arguments.

2665  %let x=;
2666  %put %sysfunc(date(&x));
22060

So they would have had to add extra logic to figure out if a function takes any arguments, is the value actually empty (null) or not, etc.

 

They might have been able to do that, but they didn't.

GGO
Obsidian | Level 7
We will fundamentally disagree, Tom. My assessment remains the same - SAS got it wrong. Yours remains the same as well - Blame the user.

"the macro processor to treat the way it handles evaluation of &y differently" - Wrong. I'm asking it to handle evaluation consistently. null is null, whether like (str=) or (str=%str()).

"just because you happen to want to pass the result to %sysfunc(trimn())" - And why would I not want to pass a valid condition through SAS' own interface to DATA STEP functions?

Blaming the user is not the solution in this case.
Tom
Super User

No one is blaming the user, just trying to explain how it works.

 

Another way to word it is that you need to adjust your internal model of how SAS works to more closely match how it actually works. As I said, the macro processor is not a programming language. The %SYSFUNC() macro function has to do a lot of work that the data step compiler would normally do, and the compiler has much more context than is available to a simple macro processor. So they had to make decisions about how to handle these types of ambiguous situations. I wouldn't call that being inconsistent.

Because of the choices they made one of these two statements causes an error and the other doesn't.

%let x=;
%put %sysfunc(trimn(&x));
%put %sysfunc(date(&x));
GGO
Obsidian | Level 7

Well, there we do agree, Tom.

 

Thanks to this discussion, and all of the insights I've gained from all of you, I have adjusted my internal model of how SAS works.

 

I now have a pretty good idea of how to overcome and protect against this fundamental flaw in %SYSFUNC() argument parsing of null strings.

Quentin
Super User

Actually, I agree with many of your points. It's a bummer that when you call %sysfunc and give the called function an empty argument, it will not pass a null argument to the function.

 

I would much prefer it if 

%let list=;
%put %sysfunc(countw(&list));

would return 0, instead of an error because of the empty argument to countw().
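One possible way to get that result today follows from the comma behavior in the logs earlier in this thread: supplying COUNTW's optional delimiter argument puts a comma in the call, so the empty first value is counted as a value. This is a sketch under the assumption that COUNTW accepts a null first value through %SYSFUNC the same way COMPRESS does above; the names n_empty and n_three are made up for the example.

```sas
%let list=;
%let n_empty=%sysfunc(countw(&list,%str( )));   /* resolves to countw(,<blank>): two values */

%let list=a b c;
%let n_three=%sysfunc(countw(&list,%str( )));   /* blank-delimited word count */

%put n_empty=&n_empty n_three=&n_three;
```

The trade-off is that the delimiter list is now explicit, so words are split on blanks only rather than on COUNTW's default delimiters.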

 

That was the point of my thread:  https://communities.sas.com/t5/SAS-Programming/SYSFUNC-and-null-arguments/m-p/412077#M100760

 

So, like you, I wish it worked differently.  


But I do see the behavior as consistent with the documentation, and also consistent across functions.  

 

Yes, fun discussion.

 
