BookmarkSubscribeRSS Feed
sri1
Obsidian | Level 7

Hi all,

 

I have the following sample code and want to create the variable "Name" with maximum length of strings in variable name instead of default $200 and I would like the length to be updated when the dataset changes.Is there anyway I can do in datastep?

 

 

data attrs;
dsid=open("sashelp.class", "i"); 
n_vars=attrn(dsid, "nvars");
do i=1 to n_vars;
Name=varname(dsid, i);
Type=vartype(dsid, i);

output;

end;

rc=close(dsid);

run;

 

Your solutions are highly appreciated.

 

Thanks

11 REPLIES 11
rudfaden
Lapis Lazuli | Level 10

Not that elegant. But it works.

Save the max length of name in a macro.

data attrs;
dsid=open("sashelp.class", "i"); 
n_vars=attrn(dsid, "nvars");
do i=1 to n_vars;
Name=varname(dsid, i);
Type=vartype(dsid, i);

output;
call symput('len',max(length(name)));

end;
rc=close(dsid);
run;

Alter the length of the column.

proc sql;
alter table attrs
  modify Name char(&len);
quit;
mkeintz
PROC Star

I would assign a fixed length of $32 to the variable name, since that is the maximum allowable  length of any sas variable name.

 

See Rules for most SAS names 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

Why does it matter what length the variable has, as long as the length is not too SHORT?

 

proc sql noprint;
  select max(length(NAME)) into :max_length trimmed
  from sashelp.class
  ;
quit;

data class ;
  length name $&max_length. ;
  set sashelp.class;
run;
rudfaden
Lapis Lazuli | Level 10

If a variable is to long it will use unessesary space and SAS will run slower. Not a big deal if you have small dataset. But if you have billions of rows it matters.

mkeintz
PROC Star

Question:   Are you trying the determine the longest length of all the varnames (that is my assumption)?   Or are you trying to determine the longest value of a variable called NAME?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
sri1
Obsidian | Level 7

@mkeintz  You are right.I am trying to  determine the longest length of all the varnames

sri1
Obsidian | Level 7

@TomThanks for your reply.I am trying to determine the maximum length of all the varnames but not NAME column of sashelp.class

Tom
Super User Tom
Super User

@sri1 wrote:

@TomThanks for your reply.I am trying to determine the maximum length of all the varnames but not NAME column of sashelp.class


Huh?  Why?  What are you going the change once you know that information?

The maximum length of a SAS variable name is 32 bytes.

 

Anyway the method to find the max is the same, only use as input the DICTIONARY metadata tables instead of the actual data.

 

sri1
Obsidian | Level 7

@Tom I know that  maximum length of a SAS variable name is 32 bytes but to reduce the storage space.

mkeintz
PROC Star

@sri1 

 

I can see the benefit of reducing the length of a variable from $200 to $32, just as a matter of good housekeeping, if nothing else. And if you were to load your resulting data set  into memory (via the SASFILE statement or via a hash  object) you could save a meaningful amount of memory for a large collection.

 

But  I don't see nearly as much value in trying to cut down from $32, to (say) $15 based on the varnames actually in use.  Using $32 guarantees you always will have allocated enough space, and also means you don't have to inspect all the metadata in order to establish the absolute minimum necessary length.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
TomKari
Onyx | Level 15

Consider using compression. I had a similar situation where I couldn't predict character variable length, and some outliers were VERY long, and when I used the compression option it reduced the overhead to a very manageable amount.

 

Tom

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 5303 views
  • 1 like
  • 5 in conversation