I have a set of values that must be included within a macro variable. The set of values exceed the maximum amount of characters that can be stored in a macro variable (around 65K), so split the total amount of characters within multiple macro variables, and then grouped the set of macro variables as nested within a major macro variable.
The code utilized to create both the first level macro variables and the major macro variable is the following:
%macro max_macro(data=,var=,macro_var=);
*Determine the number of individual submacros needed;
data &data.2;
set &data end=eof;
length_var=length(cat('"',strip(&var),'"n'));
if _n_=1 then sum_length=length_var;
else sum_length+length_var;
/**max lenght macro var 65500 chars - proxy 50k so to
have margin for the additional ""n elements **/
/**as soon as reach a term where total sum
>50k then it is assigned to group+1 **/
/**Returns the largest integer less than or equal to
a numeric value expression.**/
group=floor(sum_length/50000)+1;
if eof then call symput('num_submacro',put(group,best.));
run;
%put Number of sub macros needed= %cmpres(&num_submacro);
*Macrotize the submacros*****************************;
%do i= 1 %to &num_submacro;
%global ¯o_var submacro&i;
proc sql noprint;
select cat('"',strip(&var),'"n')
into:submacro&i separated by ' '
from &data.2
where group =&i;
quit;
%end;
*Append the submacros into single master macro;
data submacro;
do i = 1 to &num_submacro.;
submacro = cats('&submacro',i);
output;
end;
run;
proc sql noprint;
select submacro into:¯o_var separated by ' ' from submacro;
quit;
%mend max_macro;
/*** use the macro***/
%max_macro(data=col_tr_biggroup,
var=name,
macro_var=bigrams);
The macro seems to generate both the sub macro variables and the major macro variable with success.
The sample set is composed by 3085 observations with a length per observation of 5 characters, with a proxy of 15.425 characters in total
Therefore, in the example, the macro will only generate one submacro1, which is then included in the major macro "bigrams"
However, when I try to use the created macro var in the following procedure, I get the following error:
proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
var &bigrams;
ranks &rank_bigrams;
run;
MPRINT(NB_USE): proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
SYMBOLGEN: Macro variable BIGRAMS resolves to &submacro1
SYMBOLGEN: Macro variable SUBMACRO1 resolves to "
NOTE: The quoted string currently being processed has become more than 262 bytes long. You might have unbalanced quotation marks.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
I checked and the error in the documentation is linked to:
CauseSolution
The macro functions are nested more than ten times. | You cannot nest macro functions more than ten times. |
An attempt was made to assign a macro variable to itself. A macro quoting function was used inside a %DO loop that is going beyond ten iterations. | Many times a macro quoting function is used when it is not needed. If this is the case, then remove the quoting function. |
A. The sub macros are not nested more than 10 times as explained
B.the submacro and the major have different names and so can't be assigned to themselves NOR is used any quoting function
At this point I have no clue what could be generating the error. Any hints appreciated
Try transposing this data
2. all the BIGRAMS as columns encountered in the totality of the dataset (more below), which represent the number of times that bigram has been encountered in the text of that URL
(Example)
url | a | ab | ba | etc... |
www.dog.com | 1 | 2 | 3 |
|
So that the structure is
URL BIGRAM Count
And Sort and Process BY Bigram using the Count to determine ranks.
Example with some dummy data:
data example; do bigram= 'ab','ba','a','b','ac'; do url=1 to 40; count= rand('integer',25); output; end; end; run; proc sort data=example; by bigram; run; proc rank data=example groups=10 out=Rank_score; by bigram; var count; ranks rankcount; run;
I know I used URL as a number. That was just to have a place holder variable but transposing your BY URL should create something that looks a lot like that Example data set.
How do you intend to use this (very probably too convoluted) set of macro variables? As in show some code using them.
Since you apparently are creating name literals of some sort then it appears you are actually working with either data set or variable names so what are you doing with them? Quite often putting that many related values into a macro variable indicates that a process approach may be suboptimal.
BTW you may want to look up the NLITERAL function to simplify all that concatenation of quotes and N to text.
Generally when you're trying to put that much data into macro variables, it's a sign that there is a better way to do it. The macro language isn't great for storing big data.
That said, I was curious, and when I run your code with a small test case:
data class ;
set sashelp.class ;
do i=1 to 10 ;
output ;
end ;
run ;
%max_macro(data=class,var=name,macro_var=myname)
%put &myname ;
%put _user_ ;
It runs without error, and seems to be doing what you hope. Note that you have a message about unmatched quotes in the log, which I don't get. So that could be a problem resulting from quote marks in the values you have. It could also be that your SAS session has unmatched quotes so it thinks you have nested macro definitions.
If I increase the number of records to:
data class ;
set sashelp.class ;
do i=1 to 1000 ;
output ;
end ;
run ;
%max_macro(data=class,var=name,macro_var=myname)
%put _user_ ;
%put %superq(myname) ;
%put &myname ;
I then get flooded with errors, because &myname resolves to &submacro1 &submacro2 &submacro3 &submacro4, which then resolves to a string too long for the macro processor to handle.
I did something similar but without creating enough extra records from the SASHELP.CLASS data set to require additional "submacro" levels and did not see anything about unbalanced quotes.
I do suspect there may be an issue with the "count needed characters and split the macro" because this bit of code for counting
length_var=length(cat('"',strip(&var),'"n')); if _n_=1 then sum_length=length_var; else sum_length+length_var;
is not including anything for spaces between the name literal values but then stuffs them into macro variables with
select cat('"',strip(&var),'"n') into:submacro&i separated by ' '
which I suspect almost certain means that the Submacro variable has more characters, the spaces between the name literals, than were accounted for in the character counting process. Which is very likely going to mean that some quote isn't closed when the actual length exceeds that of a single macro variable.
@Quentin @ballardw thanks for the replies.
Yes, with few records the code works as expected.
To provide more insights on the task, I'm performing a text minig tasks based on a dataset of text data.
- the dataset "TR_Biggroup" include a set of records, where for each line we have:
1. a url column working as ID
2. all the BIGRAMS as columns encountered in the totality of the dataset (more below), which represent the number of times that bigram has been encountered in the text of that URL
(Example)
url | a | ab | ba | etc... |
www.dog.com | 1 | 2 | 3 |
- The dataset "col_tr_bigroup" is a dataset composed by a single column "Name" and including all the BIGRAMS derived by the text dataset
So, let's say as example my text dataset is composed by a single record with text "I love dog", the col_tr_bigroup willl be composed by a single text column "Name" with the following values in the records:
NAME |
I |
L |
Lo |
ov |
ve |
e |
etc... |
The macro %max_macro is supposed to store all the bigrams of the dataset in the macro variable &bigram, in order to perform tasks related to the dataset, like:
proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
var &bigrams;
ranks &rank_bigrams;
run;
Try transposing this data
2. all the BIGRAMS as columns encountered in the totality of the dataset (more below), which represent the number of times that bigram has been encountered in the text of that URL
(Example)
url | a | ab | ba | etc... |
www.dog.com | 1 | 2 | 3 |
|
So that the structure is
URL BIGRAM Count
And Sort and Process BY Bigram using the Count to determine ranks.
Example with some dummy data:
data example; do bigram= 'ab','ba','a','b','ac'; do url=1 to 40; count= rand('integer',25); output; end; end; run; proc sort data=example; by bigram; run; proc rank data=example groups=10 out=Rank_score; by bigram; var count; ranks rankcount; run;
I know I used URL as a number. That was just to have a place holder variable but transposing your BY URL should create something that looks a lot like that Example data set.
If I were you, instead creating 1 macrovariable with a "billion" of values in it, I would create a list of macrovariables (with one value each) and then used them with a %do-loop.
%let var = name;
%let data = col_tr_biggroup;
proc sql;
select cat('"',strip(&var),'"n')
into :alist1-
from &data.
;
%let nobs = &sqlobs.;
quit;
proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
var
%macro myLoop(I);
%do I = 1 %to &nobs.;
&&alist&I
%end;
%mend;
%myLoop()
;
ranks &rank_bigrams;
run;
Bart
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.