I'm trying to figure out how to specify lengths of new character variables (i.e., that don't already exist in the dataset) whose names are stored in a macro. The way I stored the list of "words" in the macro looks like this: var1$20 var2$30 var3$15 [and so on], in anticipation of using them in a length statement (and format statement). I guess I can't do something like the following (where "len" is the name of the macro that contains the values like above):
%macro x(x); %local i next; %do i=1 %to %sysfunc(countw(&len)); %let next=%scan(&len, &i); data mydata; set mydata; where x=&x; length &next; format &next.; run; %end; %mend x; %x(1)
Is there something I can manipulate in this code to make it work, or is there a better way I can specify lengths of new character variables whose names are stored in a macro, where the length would be some numeric value associated with each variable-to-be?
Thanks!
What is it that you have in the "magic" macro variable LEN? (A "magic" macro variable is one referenced suddenly in the middle of macro definition that is not an input parameter, not a local macro and has no GLOBAL statement or even any comment that states where is values is expected to come from.)
Assuming you have set it to something like one of these:
%let len=var1$20 var2$30 var3$15 ;
%let len= var1 $20 var2 $30 var3 $15 ;
%let len= var1-var3 $30 ;
Then you should be able to just use it directly in the LENGTH statement.
data mydata;
length &len ;
set mydata;
run;
If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats.
data mydata;
length &len ;
set mydata;
format _character_;
run;
When a variable is present in an incoming dataset, the data step compiler will take its length when the SET statement is compiled. If you want to force another length, you need to have the LENGTH statement before the SET.
Hi Kurt,
Thanks for your input. I think my main issue here is getting the macro values to resolve correctly in the length statement. One manipulation I just made was add a second argument to the countw function in the third line of code, specifying a space as the delimiter so that it's not thinking that the dollar sign is another delimiter, so this
%do i=1 %to %sysfunc(countw(&ml," "));
as opposed to this
%do i=1 %to %sysfunc(countw(&ml));
However, I'm still getting an error, where the macro isn't resolving correctly:
ERROR 22-322: Expecting a numeric constant.
ERROR 352-185: The length of numeric variables is 3-8.
and I think it may have something to do with the "$" in the value (again, the values in the list in the macro are in the form of "var1$15", etc.). I've come across %bquote and such before, but not sure how to use it, or if it's relevant here.
Since you are using COUNTW in macro code, where everything (including quotes) becomes part of the text, but where leading and trailing blanks are automatically discarded, you must force the macro processor to recognize the blank:
%do i = 1 %to %sysfunc(countw(&ml,%str( )));
I suggest you do three things:
All right--I figured out the issue. It wasn't the COUNTW function, but rather the %SCAN function in the fourth line (at least as far as I can tell). I realized that the COUNTW function as written worked in terms of correctly counting the number of elements in the macro, using the following code to test:
data test; set mydata; a=%sysfunc(countw(&ml," ")); run;
where the variable a was equal to the number of elements correctly delimited by spaces. So then I changed the fourth line to the following:
%let next=%sysfunc(scan(&ml,&i," "));
substituting the %scan macro function with a regular old scan() function using the %sysfunc macro function to make it work with the macro, and adding the space as a third argument. That worked like a charm 🙂
Thanks for helping me work this!
So you told COUNTW() and SCAN() to use spaces and double quotes as the set of delimiters between words.
That only works because the value of the macro variable ML does not contain any double quote characters.
Also why are you using SCAN() instead of %SCAN()? Why add the extra trouble (and instability) of using %SYSFUNC() to call the SAS function SCAN() when it does the same thing as the MACRO function %SCAN() does?
"Also why are you using SCAN() instead of %SCAN()? Why add the extra trouble (and instability) of using %SYSFUNC() to call the SAS function SCAN() when it does the same thing as the MACRO function %SCAN() does?"
Haha, that was just a case of my changing two things at once and thinking it was both things instead of the one thing (in this case, adding the third argument to the %scan/%sysfunc(scan()) function). Thanks again for calling out my silliness here.
"So you told COUNTW() and SCAN() to use spaces and double quotes as the set of delimiters between words.
That only works because the value of the macro variable ML does not contain any double quote characters."
Right--are you implying that there's a more appropriate way to go about it if I'm going to use the countw() and scan() functions? What if I did have double quote characters in the &ml variable value(s)? Thanks! (still learning, obviously...)
What is it that you have in the "magic" macro variable LEN? (A "magic" macro variable is one referenced suddenly in the middle of macro definition that is not an input parameter, not a local macro and has no GLOBAL statement or even any comment that states where is values is expected to come from.)
Assuming you have set it to something like one of these:
%let len=var1$20 var2$30 var3$15 ;
%let len= var1 $20 var2 $30 var3 $15 ;
%let len= var1-var3 $30 ;
Then you should be able to just use it directly in the LENGTH statement.
data mydata;
length &len ;
set mydata;
run;
If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats.
data mydata;
length &len ;
set mydata;
format _character_;
run;
Aha--wow, I was overcomplicating that big time! Reading your reply was a face palm moment, for sure. Well, in any case, I learned how to fix my original code to make it work, so I know how to navigate that now in case I run into something that requires that in the future. Thanks!
@robeldritch wrote:
Aha--wow, I was overcomplicating that big time! Reading your reply was a face palm moment, for sure. Well, in any case, I learned how to fix my original code to make it work, so I know how to navigate that now in case I run into something that requires that in the future. Thanks!
To the macro processor everything is a string. So the quotes that SAS syntax (and 99.99% of other languages) use to indicate a string are not needed. This means that any quotes you use are part of the value of the string.
If you just want a space as the delimiter than use macro quoting to pass the space into the macro function call.
%scan(&len,1,%str( ))
If you have name literals with spaces in them then use the optional fourth argument to tell %SCAN() to ignore the spaces that enclosed in quotes.
1 %put %scan("first name"n 'Last Name'n,1,%str( ),q); "first name"n
"If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats."
Yes, I assume that is what I'm worried about with respect to the inclusion of the format statement here. Your solution of just removing the formats makes sense! I've been burned in the past because I set the length of a variable and then because the format attached to it was shorter, the string was truncated, and I've since always just set the format at the same time I set the length to avoid that problem. What causes formats with lengths different than that specified in a length statement when creating new character variables in this way?
@robeldritch wrote:
"If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats."
Yes, I assume that is what I'm worried about with respect to the inclusion of the format statement here. Your solution of just removing the formats makes sense! I've been burned in the past because I set the length of a variable and then because the format attached to it was shorter, the string was truncated, and I've since always just set the format at the same time I set the length to avoid that problem. What causes formats with lengths different than that specified in a length statement when creating new character variables in this way?
It is caused by someone accidentally attaching the $ format to the variable in the source dataset. SAS itself will do this nonsense also when you use PROC IMPORT or even just libref engines to external databases.
The FORMAT attribute of a variable is an optional attribute. When it is not set then SAS will just display the value in the normal way. So for character variables that is the way the $ format displays it. For numeric that is the way the BEST format displays the value.
When you read in a variable from a dataset (or series of datasets) in a data step SAS will guess you want to use the first non-empty format specification it sees attached to that variable in any of the source datasets. But if you have an explicit FORMAT statement (or FORMAT= option of the ATTRIB statement) in the data step then that will override that initial guess.
Makes sense. Thanks so, so much for taking the time to explain all of this to me! Super appreciative!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.