BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
robeldritch
Obsidian | Level 7

I'm trying to figure out how to specify lengths of new character variables (i.e., that don't already exist in the dataset) whose names are stored in a macro. The way I stored the list of "words" in the macro looks like this: var1$20 var2$30 var3$15 [and so on], in anticipation of using them in a length statement (and format statement). I guess I can't do something like the following (where "len" is the name of the macro that contains the values like above):

 

%macro x(x);
	%local i next;
	%do i=1 %to %sysfunc(countw(&len));
	   	%let next=%scan(&len, &i);
			data mydata; set mydata; 
				where x=&x; 
				length &next; format &next.;
			run;
	%end;
%mend x; %x(1)

Is there something I can manipulate in this code to make it work, or is there a better way I can specify lengths of new character variables whose names are stored in a macro, where the length would be some numeric value associated with each variable-to-be? 

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

What is it that you have in the "magic" macro variable LEN?  (A "magic" macro variable is one referenced suddenly in the middle of macro definition that is not an input parameter, not a local macro and has no GLOBAL statement or even any comment that states where is values is expected to come from.)

 

Assuming you have set it to something  like one of these:

%let len=var1$20 var2$30 var3$15 ;
%let len= var1 $20 var2 $30 var3 $15 ;
%let len= var1-var3 $30 ;

Then you should be able to just use it directly in the LENGTH statement.

data mydata;
  length &len ;
  set mydata; 
run;

If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats.

data mydata;
  length &len ;
  set mydata; 
  format _character_;
run;

 

View solution in original post

14 REPLIES 14
Kurt_Bremser
Super User

When a variable is present in an incoming dataset, the data step compiler will take its length when the SET statement is compiled. If you want to force another length, you need to have the LENGTH statement before the SET.

robeldritch
Obsidian | Level 7

Hi Kurt,

 

Thanks for your input. I think my main issue here is getting the macro values to resolve correctly in the length statement. One manipulation I just made was add a second argument to the countw function in the third line of code, specifying a space as the delimiter so that it's not thinking that the dollar sign is another delimiter, so this

%do i=1 %to %sysfunc(countw(&ml," "));

as opposed to this

%do i=1 %to %sysfunc(countw(&ml));

However, I'm still getting an error, where the macro isn't resolving correctly:

 

ERROR 22-322: Expecting a numeric constant.

ERROR 352-185: The length of numeric variables is 3-8.

 

and I think it may have something to do with the "$" in the value (again, the values in the list in the macro are in the form of "var1$15", etc.). I've come across %bquote and such before, but not sure how to use it, or if it's relevant here. 

Kurt_Bremser
Super User

Since you are using COUNTW in macro code, where everything (including quotes) becomes part of the text, but where leading and trailing blanks are automatically discarded, you must force the macro processor to recognize the blank:

%do i = 1 %to %sysfunc(countw(&ml,%str( )));

I suggest you do three things:

  • write a data step which successfully sets the length for a single case, without any macro elements
  • write a macro which loops through your list, without a data step, which only displays the single values with %PUT, to verify the pure macro code
  • finally, merge these two parts together
robeldritch
Obsidian | Level 7

All right--I figured out the issue. It wasn't the COUNTW function, but rather the %SCAN function in the fourth line (at least as far as I can tell). I realized that the COUNTW function as written worked in terms of correctly counting the number of elements in the macro, using the following code to test:

data test; set mydata;
	a=%sysfunc(countw(&ml," "));
run;

where the variable a was equal to the number of elements correctly delimited by spaces. So then I changed the fourth line to the following:

%let next=%sysfunc(scan(&ml,&i," "));

substituting the %scan macro function with a regular old scan() function using the %sysfunc macro function to make it work with the macro, and adding the space as a third argument. That worked like a charm 🙂

 

Thanks for helping me work this!

Tom
Super User Tom
Super User

So you told COUNTW() and SCAN()  to use spaces and double quotes as the set of delimiters between words.

That only works because the value of the macro variable ML does not contain any double quote characters.

 

Also why are you using SCAN() instead of %SCAN()?  Why add the extra trouble (and instability) of using %SYSFUNC() to call the SAS function SCAN() when it does the same thing as the MACRO function %SCAN() does?

robeldritch
Obsidian | Level 7

"Also why are you using SCAN() instead of %SCAN()?  Why add the extra trouble (and instability) of using %SYSFUNC() to call the SAS function SCAN() when it does the same thing as the MACRO function %SCAN() does?"

 

Haha, that was just a case of my changing two things at once and thinking it was both things instead of the one thing (in this case, adding the third argument to the %scan/%sysfunc(scan()) function). Thanks again for calling out my silliness here. 

robeldritch
Obsidian | Level 7

"So you told COUNTW() and SCAN()  to use spaces and double quotes as the set of delimiters between words.

That only works because the value of the macro variable ML does not contain any double quote characters."

 

Right--are you implying that there's a more appropriate way to go about it if I'm going to use the countw() and scan() functions? What if I did have double quote characters in the &ml variable value(s)? Thanks! (still learning, obviously...)

Astounding
PROC Star
It looks like your plan is to run a DATA step for each new variable. If your data set is small, that should be OK. But otherwise, that is horribly inefficient. Design code that works with a single DATA step (there are other efficient options as well). Then use macro language to generate your more efficient code.

For example, you may not need to set formats at all. And lengths could be set with a short simple program:

data want;
length &len;
set mydata;
run;
Tom
Super User Tom
Super User

What is it that you have in the "magic" macro variable LEN?  (A "magic" macro variable is one referenced suddenly in the middle of macro definition that is not an input parameter, not a local macro and has no GLOBAL statement or even any comment that states where is values is expected to come from.)

 

Assuming you have set it to something  like one of these:

%let len=var1$20 var2$30 var3$15 ;
%let len= var1 $20 var2 $30 var3 $15 ;
%let len= var1-var3 $30 ;

Then you should be able to just use it directly in the LENGTH statement.

data mydata;
  length &len ;
  set mydata; 
run;

If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats.

data mydata;
  length &len ;
  set mydata; 
  format _character_;
run;

 

robeldritch
Obsidian | Level 7

Aha--wow, I was overcomplicating that big time! Reading your reply was a face palm moment, for sure. Well, in any case, I learned how to fix my original code to make it work, so I know how to navigate that now in case I run into something that requires that in the future. Thanks!

Tom
Super User Tom
Super User

@robeldritch wrote:

Aha--wow, I was overcomplicating that big time! Reading your reply was a face palm moment, for sure. Well, in any case, I learned how to fix my original code to make it work, so I know how to navigate that now in case I run into something that requires that in the future. Thanks!


To the macro processor everything is a string.  So the quotes that SAS syntax (and 99.99% of other languages) use to indicate a string are not needed.  This means that any quotes you use are part of the value of the string.

 

If you just want a space as the delimiter than use macro quoting to pass the space into the macro function call.

%scan(&len,1,%str( ))

If you have name literals with spaces in them then use the optional fourth argument to tell %SCAN() to ignore the spaces that enclosed in quotes.

   1  %put %scan("first name"n 'Last Name'n,1,%str( ),q);
"first name"n
robeldritch
Obsidian | Level 7

"If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats."

 

Yes, I assume that is what I'm worried about with respect to the inclusion of the format statement here. Your solution of just removing the formats makes sense! I've been burned in the past because I set the length of a variable and then because the format attached to it was shorter, the string was truncated, and I've since always just set the format at the same time I set the length to avoid that problem. What causes formats with lengths different than that specified in a length statement when creating new character variables in this way?

Tom
Super User Tom
Super User

@robeldritch wrote:

"If you are worried that you might end up with VAR1 defined with a length of $20 and a format of $10. attached to it that would print only the first half of the value then just remove the formats."

 

Yes, I assume that is what I'm worried about with respect to the inclusion of the format statement here. Your solution of just removing the formats makes sense! I've been burned in the past because I set the length of a variable and then because the format attached to it was shorter, the string was truncated, and I've since always just set the format at the same time I set the length to avoid that problem. What causes formats with lengths different than that specified in a length statement when creating new character variables in this way?


It is caused by someone accidentally attaching the $ format to the variable in the source dataset.  SAS itself will do this nonsense also when you use PROC IMPORT or even just libref engines to external databases.

 

The FORMAT attribute of a variable is an optional attribute.  When it is not set then SAS will just display the value in the normal way.  So for character variables that is the way the $ format displays it.  For numeric that is the way the BEST format displays the value.

 

When you read in a variable from a dataset (or series of datasets) in a data step SAS will guess you want to use the first non-empty format specification it sees attached to that variable in any of the source datasets.  But if you have an explicit FORMAT statement (or FORMAT= option of the ATTRIB statement) in the data step then that will override that initial guess.

robeldritch
Obsidian | Level 7

Makes sense. Thanks so, so much for taking the time to explain all of this to me! Super appreciative!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 6763 views
  • 3 likes
  • 4 in conversation