BookmarkSubscribeRSS Feed
WesBarris
Obsidian | Level 7

I have the following macro used for reading a tab-delimited file:

%macro readFile(fileRef=);

   %let maxLen=:$100.;

   infile &fileRef delimiter='09'x dsd;

   input wingband this &maxLen that &maxLen;

%mend readFile;

This works but the variables "this" and "that" maintain a length of 100 characters within sas.

Is there a way to specify a maximum length but have the variables be only as long as necessary

to hold their values?

11 REPLIES 11
FriedEgg
SAS Employee

data foo;

infile cards truncover;

retain mlen_this 0 mlen_that 0;

input wingband this :$100. that $100.;

mlen_this=max(mlen_this,length(this));

mlen_that=max(mlen_that,length(that));

call symput('mlen_this','$' || put(mlen_this,best.));

call symput('mlen_that','$' || put(mlen_that,best.));

drop mlen:;

cards;

1 abcdef ab

2 abc abcd

3 ab abc

4 abcd abcde

5 a a

;

run;

data bar;

length this &mlen_this that &mlen_that;

set foo;

run;

SASJedi
SAS Super FREQ

I really like Fried Egg's approach, but would suggest a couple of small modifications for efficiency's sake:

 data foo;
   /* Add end=last to be able to detect the end of the input data    */
   infile cards truncover end=last;
   retain mlen_this 0 mlen_that 0;
   /* To begin with, make the text variables as log as possible */
   input wingband this :$32767. that :$32767.;
   mlen_this=max(mlen_this,length(this));
   mlen_that=max(mlen_that,length(that));
   if last then do;
   /* CALL SYMPUTXs need only be executed once, after the last record of the input data is read */
   /* Use SYMPUTX instead of SYMPUT to trim leading & trailng spaces from values */
   /* Just save the number, don't concatenate any text  */
      call symputx('mlen_this',mlen_this);
      call symputx('mlen_that',mlen_that);
   end;    
drop mlen:; 
cards; 
1 abcdef ab 
2 abc abcd 
3 ab abc 
4 abcd abcde 
5 a a 
; 
run;

   /* Use the macro variable values to modify the table structure in place */
   /* This should be more efficient that re-writing the data set with a DATA step */
proc sql; 
alter table work.foo
   modify this char(&mlen_this),
          that char(&mlen_that) ;
quit; 
Check out my Jedi SAS Tricks for SAS Users
FriedEgg
SAS Employee

I agree with the modifications here, they are all beneficial to performance.  There is one issue with the exact implementation here that I see however.  The cards statement is unbuffered and cannot use the end option on the infile statement.  You could instead use the parmcards statement.  If this is not the case in a different OS or version of SAS I would be interested in knowing.

SASJedi
SAS Super FREQ

Yep, my bad - CARDS is unbuffered, and the END= option won't work here...

Check out my Jedi SAS Tricks for SAS Users
Tom
Super User Tom
Super User

Why?

Have you considered just using the compress=yes option so that SAS will not waste space storing the extra blanks.

You can either use:

options compress=yes;

To apply it to all future datasets.  Or use it as a dataset option.

data new (compress=yes);

   infile ...;

   input ... ;

run;

FriedEgg
SAS Employee

Since these are large character variables you are storing it would be best to use compress=char.

WesBarris
Obsidian | Level 7

Thanks FriedEgg.  Actually, the values are not necessarily that large.  I was just looking for a way to specify a sufficiently large default value for length but then automatically reduce the length according to the data.  I could determine the length of each variable manually, I was just trying to automate that (which you did show how to do in your example).

WesBarris
Obsidian | Level 7

Thanks.  I didn't know about the compress option.  However, it looks like that uses RLE to compress the value internally.  I was looking for something that would automatically reduce the "length" to the appropriate size.

Tom
Super User Tom
Super User

You will have to read the file more than once.

data length / view=length ;

   infile &fileRef delimiter='09'x dsd end=eof truncover ;

   length x1-x10 $200 ;

   input x1-x10;

   array var x1-x10;

   array len len1-len10;

   do over x; len = lengthm(x); end;

run;

proc summary data=length;

    var len1-len10;

   output out=lengths max= ;

run;

Peter_C
Rhodochrosite | Level 12

Unless you use compression, SAS stores the length you define. Even with the $varying informat, a fixed length is defined for holding the variable that is the maximum width.

Compress=yes is equivalent to Compress=char . These are effective with less overheads on CPU and real time, than Compress=binary

Ksharp
Super User

It is almost not. You need to guess the length of var firstly. But you do not know the length until after browsing

the whole file.

There is a workaround is to use proc import + guessingrow= ,then copy the code generated by SAS

into editor to tailor for yourself.

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 3151 views
  • 5 likes
  • 6 in conversation