Base SAS 9.4.
I am trying to thoroughly understand macro quoting (‘masking’) within the context of SAS processing (Input Buffer / Word Scanner / Macro Processor / Symbol Tables (in this case Global only) / Macro Catalog / Compiler).
The SAS code examples provided are quite simple, but I am trying to break down the SAS process to determine exactly where "unquoting is done for you", as referenced below.
FYI: I really could not find a more abbreviated way to break down the various parts of the SAS process, so thanks in advance.
/* EXAMPLE 1 */
%let mv1=data test; var='a'; run; 
%put &=mv1;
%put _user_;
I understand why Example 1 would fail
Macro Processor
begins to 'execute' the %let statement
adds MV1 to Global Symbol Table
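mv1 data test (<< no semicolon)
The first ';' ends the %let statement
The remainder, " var='a'; run;", is never added to the GST; it is passed on as ordinary SAS code, where var='a'; is not a valid statement in open code
ERROR 180-322: Statement is not valid or it is used out of proper order.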
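A minimal sketch to confirm what actually lands in the Global Symbol Table in Example 1. The square brackets in the %put are only there to make the boundaries of the stored value visible; they are my addition, not part of the original example.

```sas
/* The %let statement ends at the first unmasked semicolon, */
/* so only the text before it is stored in MV1.             */
%let mv1=data test;

/* Brackets mark where the stored value starts and stops.   */
%put mv1=[&mv1];   /* writes mv1=[data test] to the log */
```

Everything after that first semicolon in the original statement is never seen by %let at all.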
/* EXAMPLE 2 */
%let mv2=%str(data test; var='a'; run;); 
%put _user_; 
%put &=mv2;
I understand why we need to use masking as in Example 2
Macro Processor
begins to 'execute' the %let statement
adds MV2 (masked) to Global Symbol Table
mv2 data test; var='a'; run;
>> NOTE: %put _user_; 'shows us' those delta characters
/* create macro variable*/
%let mv3=%str(data test; var='a'; run;); 
%put _user_; 
%put &=mv3;
/* Example 3a)*/
%macro test;
%put _user_;
&mv3;
%mend test;
%test;
I might expect that Example 3a would fail (given that there is no explicit %unquote to ‘unmask’ mv3 (the masked string found in the Global Symbol Table)).
However, does Example 3a fall under this description (from a SAS blog)?
>>> There are three cases where the unquoting is done for you:
And, thus Example 3b (with explicit %unquote) further below is just not necessary?
I have tried to break down Example 3a) below in the context of SAS processing to understand precisely where 'unquoting is done for you'.
NOTE: the masked string mv3=%str(data test; var='a'; run;); .. already exists in the Global Symbol Table
Input Buffer
%macro test;
&mv3;
%mend test;
%test;
Word Scanner
detects %macro key word
MP is triggered
Macro Processor
During macro compilation
creates entry in the Macro Catalog
pulls tokens from the Input Buffer to the Macro Catalog until %mend
> %IF etc. as macro instructions
> noncompiled items as text
Macro Catalog
%macro test;
&mv3;
%mend test;
Input Buffer
%test;
Word Scanner
detects % trigger
Macro Processor is triggered
Macro Processor
begins to execute the macro
recognizes non-compiled text
places &mv3 into the IB
Input Buffer
&mv3.;
Word Scanner
continues to tokenize from the IB
recognizes macro trigger '&'
triggers MP
Is it here that the 'item' (the masked string from the Global Symbol Table) leaves the word scanner and is passed to the SAS Macro Facility and therefore, ‘unquoting is done for you’?
Macro Processor
looks in Local (and then Global) Symbol Table and resolves &mv3
>>> mv3 is now clearly unmasked <<<
&mv3; >>> (data test; var='a'; run;)
places non-compiled text into the Input Buffer
recognizes %mend
macro test ceases execution
Input Buffer
data test; var='a'; run;
Word Scanner
continues to tokenize from the IB
recognizes DATA as step boundary
triggers the DATA step compiler
Compiler
data test; var='a'; run;
The compiled DATA step is executed
The DATA step compiler is cleared
Therefore, Example 3b (where %unquote is explicit) is just not needed?
/* Example 3b)*/
%macro test;
%put _user_;
%unquote(&mv3);
%mend test;
%test;
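One way to watch this in the log, as a minimal sketch: turn on SYMBOLGEN and MPRINT before calling the macro. SYMBOLGEN reports each macro variable resolution and, for a masked value, adds a note that characters subject to macro quoting have been unquoted for printing. MPRINT reports the statements the macro generates. The macro name TEST3A is hypothetical, used only to keep this separate from the examples above.

```sas
/* Masked value, as in the examples above */
%let mv3=%str(data test; var='a'; run;);

/* SYMBOLGEN: log each macro variable resolution    */
/* MPRINT:    log the SAS statements a macro emits  */
options symbolgen mprint;

%macro test3a;   /* hypothetical name, same body as Example 3a */
  &mv3;
%mend test3a;
%test3a

/* Restore the options so later log output stays quiet */
options nosymbolgen nomprint;
```

This only shows that the DATA step arrives at the compiler unmasked without an explicit %unquote. It does not by itself show which component performed the unmasking.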
Have you seen this ??
Quoting in SAS® Macro Programming Q&A, Slides, and On-Demand Recording
https://communities.sas.com/t5/Ask-the-Expert/Quoting-in-SAS-Macro-Programming-Q-amp-A-Slides-and-On...
Koen
Hmm, I don't really like your second unquoting rule:
>>> There are three cases where the unquoting is done for you:
- The %UNQUOTE function was used
- The item leaves the word scanner and is passed to the DATA step compiler, SAS Macro Facility, or other parts of the SAS System.
- The value is returned from the %SCAN, %SUBSTR, or %UPCASE function
But it's consistent with what I see in the docs:
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/mcrolref/p1f5qisx8mv9ygn1dikmgba1lmmu.htm
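The third rule in that list can be seen with a small sketch comparing %SCAN with its quoting counterpart %QSCAN. The variable name A and its value are made up for illustration, and the sketch assumes no macro variable named TWO exists in the session.

```sas
/* %nrstr masks the ampersand, so &two is stored as plain text */
%let a=%nrstr(one&two three);

/* %QSCAN returns its result still masked: &two is left alone */
%put QSCAN: %qscan(&a, 1, %str( ));

/* %SCAN returns its result unquoted: the macro processor now   */
/* sees &two as a macro variable reference and tries to resolve */
/* it, which should produce an "Apparent symbolic reference"    */
/* warning when TWO is not defined.                             */
%put SCAN: %scan(&a, 1, %str( ));
```

The same pairing applies to %SUBSTR / %QSUBSTR and %UPCASE / %QUPCASE.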
Here's how I think about it. Macro quoting is part of the macro language. Macro quoting symbols have no meaning outside of the macro language. Therefore, when the macro processor returns text to the word scanner, it should automatically unquote the text. This is necessary so that the word scanner can appropriately build tokens for the DATA step compiler, PROC interpreter, or whatever part of SAS is appropriate.
So in my head, when you code:
%let mv3=%str(data test; var='a'; run;); 
%macro test;
%put _user_;
&mv3
%mend test;
%test;
It makes sense that it works, because the macro processor unmasks the semicolons before sending them to the word scanner. So the macro processor resolves the reference to MV3, and gets a value with quoting symbols in it. Then when it returns that value to the word scanner, it unmasks the semicolons.
I suppose it's possible that the word scanner is doing the unquoting (as the docs suggest), but that doesn't really help with my understanding of the process. And as these are all logical constructs, I'm happy to stick with the idea that the macro processor is responsible for unquoting values before returning them to SAS.
I've been thinking about this more, and reading more, and I think your post, and the documentation, are correct that it is the word scanner that is doing the unquoting, not the macro processor.
This part of the docs has a nice walk-through of the macro execution process, showing what is returned to the input stack, etc. I think it agrees with your writeup. It doesn't address quoting. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/mcrolref/p0eksviivbw6bwn1eg6h2crvl7ct.htm
This page of the docs seems pretty clear that the word scanner is doing unquoting before passing tokens on, which still surprises me. But I guess it's believable. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/mcrolref/p1f5qisx8mv9ygn1dikmgba1lmmu.htm
This page of the docs says the macro processor does the un-quoting: "When the macro processor is finished with a macro quoted text string, it removes the macro quoting-coded substitute characters and replaces them with the original characters. The unmasked characters are passed on to the rest of the system." https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/mcrolref/n0tmct0ywyatobn19gdnxdjvpa89.htm That's how I like to think about it. But I think it's wrong (at least it's inconsistent with the other part of the docs).
Susan O'Connor wrote one of the foundational papers on macro quoting, which includes this statement: "The word scanner knows how to handle delta characters and treats them as tokens when appropriate." https://www.lexjansen.com/nesug/nesug99/bt/bt185.pdf So that is consistent with the documentation saying it is the word scanner, not the macro processor, which does unquoting. If the macro processor did the unquoting, the word scanner would not need to know about delta characters. Since you're interested in macro quoting, definitely read this paper.
My other favorite macro quoting paper is Ian Whitlock's: https://support.sas.com/resources/papers/proceedings/proceedings/sugi28/011-28.pdf . That paper suggests that the macro facility is doing the unquoting (but this paper never mentions the word scanner, so perhaps this is an intentional over-simplification).
In a sense, I think it doesn't really matter whether the macro processor unquotes values, or the word scanner unquotes them. It's even possible that they could work together on unquoting, and as new versions of SAS were released, the implementation of how unquoting is handled may have evolved.
Quentin,
Many thanks for the quite valuable feedback and insight. I do appreciate you doing a deep dive and providing additional references.
“How the Macro Processor Executes a Compiled Macro” / “Restoring the Significance of Symbols” / Susan O’Connor pdf
My recent observations also align with "the word scanner knowing how to handle delta characters and treating them as tokens when appropriate."
Furthermore, not sure if you agree with this ad-hoc assessment
"I’m not certain that the MP ever <directly> returns text to the WS. The WS reads from the IB - aka Input Buffer/ Stack - (only and ever? ..aka “tokenizes the text”) and then sends any macro triggers to the MP."
My walnut size SAS brain needs to still 'see it in context' so the attached Excel doc outlines two simple processes where one produces an error (due to NO unmasking) and the other one succeeds (due to unmasking). I believe this demonstrates 'in context':
i) that indeed the WS can handle delta characters (and would still send a masked value to the IB)
ii) that 'unquoting' / 'unmasking' is required at times (it cannot always "be done for you")
iii) exactly where the compile-time error occurs
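For point ii), here is a minimal sketch of a case where automatic unquoting is generally reported not to be enough. This is not the content of the attached Excel doc; it is a separate illustration built on the commonly cited pattern of masked quotation marks, and the names VAL and TESTVAL are only for illustration.

```sas
%let val=aaa;

/* %' is a masked single quotation mark; &val still resolves, */
/* so TESTVAL holds 'aaa' with both quotation marks masked.   */
%let testval=%str(%'&val%');

/* Without %unquote: the masked quotation marks are not       */
/* tokenized as the boundaries of a literal, so this step is  */
/* expected to fail with a syntax error at compile time.      */
data _null_;
  val=&testval;
  put 'VAL=' val;
run;

/* With %unquote: the value is unmasked before tokenization,  */
/* so 'aaa' is seen as an ordinary character literal.         */
data _null_;
  val=%unquote(&testval);
  put 'VAL=' val;
run;
```

The difference from Example 3a is that here masking changes how the text is tokenized, which is the situation where an explicit %unquote earns its keep.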
Yes, I would agree with:
"I’m not certain that the MP ever <directly> returns text to the WS. The WS reads from the IB - aka Input Buffer/ Stack - (only and ever? ..aka “tokenizes the text”)"
Here's another reference. The great Russ Lavery has a series of "animated guide" talks where he shows various processes in action. He has an animated guide talk for macro quoting. The paper is: https://www.lexjansen.com/wuss/2016/134_Final_Paper_PDF.pdf . But the paper doesn't do it justice. It's a thing of beauty to watch an animated powerpoint slide showing code and tokens flowing from the input stack, through the word scanner, and beyond. If you ever hear of Russ doing that talk live, you don't want to miss it.
Good luck on your macro adventure. it's fun stuff!
