DATA Step, Macro, Functions and more

Reading tab-delimited data using a macro

Reply
Contributor
Posts: 44

Reading tab-delimited data using a macro

I have the following macro used for reading a tab-delimited file:

%macro readFile(fileRef=);

   %let maxLen=:$100.;

   infile &fileRef delimiter='09'x dsd;

   input wingband this &maxLen that &maxLen;

%mend readFile;

This works but the variables "this" and "that" maintain a length of 100 characters within sas.

Is there a way to specify a maximum length but have the variables be only as long as necessary

to hold their values?

Trusted Advisor
Posts: 1,300

Reading tab-delimited data using a macro

data foo;

infile cards truncover;

retain mlen_this 0 mlen_that 0;

input wingband this :$100. that $100.;

mlen_this=max(mlen_this,length(this));

mlen_that=max(mlen_that,length(that));

call symput('mlen_this','$' || put(mlen_this,best.));

call symput('mlen_that','$' || put(mlen_that,best.));

drop mlen:;

cards;

1 abcdef ab

2 abc abcd

3 ab abc

4 abcd abcde

5 a a

;

run;

data bar;

length this &mlen_this that &mlen_that;

set foo;

run;

SAS Employee
Posts: 104

Re: Reading tab-delimited data using a macro

I really like Fried Egg's approach, but would suggest a couple of small modifications for efficiency's sake:

 data foo;
   /* Add end=last to be able to detect the end of the input data    */
   infile cards truncover end=last;
   retain mlen_this 0 mlen_that 0;
   /* To begin with, make the text variables as log as possible */
   input wingband this :$32767. that :$32767.;
   mlen_this=max(mlen_this,length(this));
   mlen_that=max(mlen_that,length(that));
   if last then do;
   /* CALL SYMPUTXs need only be executed once, after the last record of the input data is read */
   /* Use SYMPUTX instead of SYMPUT to trim leading & trailng spaces from values */
   /* Just save the number, don't concatenate any text  */
      call symputx('mlen_this',mlen_this);
      call symputx('mlen_that',mlen_that);
   end;    
drop mlen:; 
cards; 
1 abcdef ab 
2 abc abcd 
3 ab abc 
4 abcd abcde 
5 a a 
; 
run;

   /* Use the macro variable values to modify the table structure in place */
   /* This should be more efficient that re-writing the data set with a DATA step */
proc sql; 
alter table work.foo
   modify this char(&mlen_this),
          that char(&mlen_that) ;
quit; 
Trusted Advisor
Posts: 1,300

Reading tab-delimited data using a macro

I agree with the modifications here, they are all beneficial to performance.  There is one issue with the exact implementation here that I see however.  The cards statement is unbuffered and cannot use the end option on the infile statement.  You could instead use the parmcards statement.  If this is not the case in a different OS or version of SAS I would be interested in knowing.

SAS Employee
Posts: 104

Re: Reading tab-delimited data using a macro

Yep, my bad - CARDS is unbuffered, and the END= option won't work here...

Super User
Super User
Posts: 6,502

Reading tab-delimited data using a macro

Why?

Have you considered just using the compress=yes option so that SAS will not waste space storing the extra blanks.

You can either use:

options compress=yes;

To apply it to all future datasets.  Or use it as a dataset option.

data new (compress=yes);

   infile ...;

   input ... ;

run;

Trusted Advisor
Posts: 1,300

Reading tab-delimited data using a macro

Since these are large character variables you are storing it would be best to use compress=char.

Contributor
Posts: 44

Reading tab-delimited data using a macro

Thanks FriedEgg.  Actually, the values are not necessarily that large.  I was just looking for a way to specify a sufficiently large default value for length but then automatically reduce the length according to the data.  I could determine the length of each variable manually, I was just trying to automate that (which you did show how to do in your example).

Contributor
Posts: 44

Reading tab-delimited data using a macro

Thanks.  I didn't know about the compress option.  However, it looks like that uses RLE to compress the value internally.  I was looking for something that would automatically reduce the "length" to the appropriate size.

Super User
Super User
Posts: 6,502

Re: Reading tab-delimited data using a macro

You will have to read the file more than once.

data length / view=length ;

   infile &fileRef delimiter='09'x dsd end=eof truncover ;

   length x1-x10 $200 ;

   input x1-x10;

   array var x1-x10;

   array len len1-len10;

   do over x; len = lengthm(x); end;

run;

proc summary data=length;

    var len1-len10;

   output out=lengths max= ;

run;

Valued Guide
Posts: 2,175

Reading tab-delimited data using a macro

Unless you use compression, SAS stores the length you define. Even with the $varying informat, a fixed length is defined for holding the variable that is the maximum width.

Compress=yes is equivalent to Compress=char . These are effective with less overheads on CPU and real time, than Compress=binary

Super User
Posts: 9,682

Re: Reading tab-delimited data using a macro

It is almost not. You need to guess the length of var firstly. But you do not know the length until after browsing

the whole file.

There is a workaround is to use proc import + guessingrow= ,then copy the code generated by SAS

into editor to tailor for yourself.

Ksharp

Ask a Question
Discussion stats
  • 11 replies
  • 746 views
  • 5 likes
  • 6 in conversation