BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dcortell
Pyrite | Level 9

I have a set of values that must be included within a macro variable. The set of values exceed the maximum amount of characters that can be stored in a macro variable  (around 65K), so split the total amount of characters within multiple macro variables, and then grouped the set of macro variables as nested within a major macro variable.

 

The code utilized to create both the first level macro variables and the major macro variable is the following:

 

%macro max_macro(data=,var=,macro_var=);
*Determine the number of individual submacros needed;

data &data.2;
 set &data end=eof;
 length_var=length(cat('"',strip(&var),'"n'));
 if _n_=1 then sum_length=length_var;
 else sum_length+length_var;
 /**max lenght macro var 65500 chars - proxy 50k so to
    have margin for the additional ""n elements **/

 /**as soon as reach a term where total sum 
 >50k then it is assigned to group+1 **/

 /**Returns the largest integer less than or equal to 
    a numeric value expression.**/

 group=floor(sum_length/50000)+1;
 if eof then call symput('num_submacro',put(group,best.));
run;

%put Number of sub macros needed= %cmpres(&num_submacro);

*Macrotize the submacros*****************************;
%do i= 1 %to &num_submacro;

%global &macro_var submacro&i;

proc sql noprint;
 select cat('"',strip(&var),'"n')
 into:submacro&i separated by ' '
 
 from &data.2
 where group =&i;
quit;
%end;

*Append the submacros into single master macro;
data submacro;
 do i = 1 to &num_submacro.;
 submacro = cats('&submacro',i);
 output;
 end;
run;

proc sql noprint;
 select submacro into:&macro_var separated by ' ' from submacro;
quit;

%mend max_macro;

/*** use the macro***/

%max_macro(data=col_tr_biggroup,
var=name,
macro_var=bigrams);

The macro seems to generate both the sub macro variables and the major macro variable with success.

 

The sample set is composed by 3085 observations with a length per observation of 5 characters, with a proxy of 15.425 characters in total

 

Therefore, in the example, the macro will only generate one submacro1, which is then included in the major macro "bigrams"

 

However, when I try to use the created macro var in the following procedure, I get the following error:

proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
var &bigrams;
ranks &rank_bigrams;
run;

MPRINT(NB_USE):   proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
SYMBOLGEN:  Macro variable BIGRAMS resolves to &submacro1
SYMBOLGEN:  Macro variable SUBMACRO1 resolves to "
NOTE: The quoted string currently being processed has become more than 262 bytes long.  You might have unbalanced quotation marks.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.
ERROR: Maximum level of nesting of macro functions exceeded.

I checked and the error in the documentation is linked to:

 

CauseSolution

The macro functions are nested more than ten times.
You cannot nest macro functions more than ten times.
An attempt was made to assign a macro variable to itself. A macro quoting function was used inside a %DO loop that is going beyond ten iterations.
Many times a macro quoting function is used when it is not needed. If this is the case, then remove the quoting function.

 

A. The sub macros are not nested more than 10 times as explained

B.the submacro and the major have different names and so can't be assigned to themselves NOR is used any quoting function

 

At this point I have no clue what could be generating the error. Any hints appreciated

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Try transposing this data

2. all the BIGRAMS as columns encountered in the totality of the dataset (more below), which represent the number of times that bigram has been encountered in the text of that URL

(Example)

url a ab ba etc...
www.dog.com 1 2 3

 

 

So that the structure is

URL   BIGRAM   Count

 

And Sort and Process BY Bigram using the Count to determine ranks.

 

Example with some dummy data:

data example;
   do bigram= 'ab','ba','a','b','ac';
      do url=1 to 40;
         count= rand('integer',25);
         output;
      end;
   end;
run;
proc sort data=example;
   by bigram;
run;

proc rank data=example groups=10 out=Rank_score;
by bigram;
var count;
ranks rankcount;
run;

I know I used URL as a number. That was just to have a place holder variable but transposing your BY URL should create something that looks a lot like that Example data set.

View solution in original post

7 REPLIES 7
ballardw
Super User

How do you intend to use this (very probably too convoluted) set of macro variables? As in show some code using them.

 

Since you apparently are creating name literals of some sort then it appears you are actually working with either data set or variable names so what are you doing with them? Quite often putting that many related values into a macro variable indicates that a process approach may be suboptimal.

 

BTW you may want to look up the NLITERAL function to simplify all that concatenation of quotes and N to text.

Quentin
Super User

Generally when you're trying to put that much data into macro variables, it's a sign that there is a better way to do it.  The macro language isn't great for storing big data.

 

That said, I was curious, and when I run your code with a small test case:

 

data class ;
 set sashelp.class ;
 do i=1 to 10 ;
 output ;
 end ;
run ;

%max_macro(data=class,var=name,macro_var=myname)

%put &myname ;
%put _user_ ;

It runs without error, and seems to be doing what you hope.  Note that you have a message about unmatched quotes in the log, which I don't get.  So that could be a problem resulting from quote marks in the values you have. It could also be that your SAS session has unmatched quotes so it thinks you have nested macro definitions.

 

 

If I increase the number of records to:

data class ;
 set sashelp.class ;
 do i=1 to 1000 ;
 output ;
 end ;
run ;

%max_macro(data=class,var=name,macro_var=myname)

%put _user_ ;
%put %superq(myname) ;
%put &myname ;

I then get flooded with errors, because &myname resolves to &submacro1 &submacro2 &submacro3 &submacro4, which then resolves to a string too long for the macro processor to handle.

 

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ballardw
Super User

@Quentin 

I did something similar but without creating enough extra records from the SASHELP.CLASS data set to require additional "submacro" levels and did not see anything about unbalanced quotes.

 

I do suspect there may be an issue with the "count needed characters and split the macro" because this bit of code for counting

 length_var=length(cat('"',strip(&var),'"n'));
 if _n_=1 then sum_length=length_var;
 else sum_length+length_var;

is not including anything for spaces between the name literal values but then stuffs them into macro variables with

 select cat('"',strip(&var),'"n')
 into:submacro&i separated by ' '

which I suspect almost certain means that the Submacro variable has more characters, the spaces between the name literals, than were accounted for in the character counting process. Which is very likely going to mean that some quote isn't closed when the actual length exceeds that of a single macro variable.

dcortell
Pyrite | Level 9

@Quentin @ballardw thanks for the replies.

 

Yes, with few records the code works as expected.

 

To provide more insights on the task, I'm performing a text minig tasks based on a dataset of text data. 

 

- the dataset "TR_Biggroup" include a set of records, where for each line we have:

 

1. a url column working as ID

2. all the BIGRAMS as columns encountered in the totality of the dataset (more below), which represent the number of times that bigram has been encountered in the text of that URL

(Example)

urlaabbaetc...
www.dog.com123 

 

- The dataset "col_tr_bigroup" is a dataset composed by a single column "Name" and including all the BIGRAMS derived by the text dataset

So, let's say as example my text dataset is composed by a single record with text "I love dog", the col_tr_bigroup willl be composed by a single text column "Name" with the following values in the records:

NAME
I
 L
Lo
ov
ve
e
etc...

 

The macro %max_macro is supposed to store all the bigrams of the dataset in the macro variable &bigram, in order to perform tasks related to the dataset, like:

 

proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
var &bigrams;
ranks &rank_bigrams;
run;

 

 

ballardw
Super User

Try transposing this data

2. all the BIGRAMS as columns encountered in the totality of the dataset (more below), which represent the number of times that bigram has been encountered in the text of that URL

(Example)

url a ab ba etc...
www.dog.com 1 2 3

 

 

So that the structure is

URL   BIGRAM   Count

 

And Sort and Process BY Bigram using the Count to determine ranks.

 

Example with some dummy data:

data example;
   do bigram= 'ab','ba','a','b','ac';
      do url=1 to 40;
         count= rand('integer',25);
         output;
      end;
   end;
run;
proc sort data=example;
   by bigram;
run;

proc rank data=example groups=10 out=Rank_score;
by bigram;
var count;
ranks rankcount;
run;

I know I used URL as a number. That was just to have a place holder variable but transposing your BY URL should create something that looks a lot like that Example data set.

Quentin
Super User
I wish I could give @ballardw's answer more than one like. This is a great example where having wide data, with information stored in variable names, makes coding much more difficult. And if you reshape the data to vertical, putting the information into values, the coding becomes almost trivial.
BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
yabwon
Onyx | Level 15

If I were you, instead creating 1 macrovariable with a "billion" of values in it, I would create a list of macrovariables (with one value each) and then used them with a %do-loop.

%let var  = name;
%let data = col_tr_biggroup;

proc sql;
select cat('"',strip(&var),'"n') 
into :alist1-
from &data.
;
%let nobs = &sqlobs.;
quit;



proc rank data=TR_BIGGROUP groups=10 out=Rank_score;
var 
  %macro myLoop(I);
    %do I = 1 %to &nobs.;
      &&alist&I
    %end; 
  %mend;
  %myLoop()
;
ranks &rank_bigrams;
run;

Bart

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2354 views
  • 2 likes
  • 4 in conversation