BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
psrajput
Obsidian | Level 7

Hi,

 

I want to convert a SAS Dataset into a pipe-delimited file. The file has 3 MM obs and 10k vars.

And, I am facing two challenges:

 

1. SAS doesn't convert the header after a certain number of columns.

2. Certain rows where the fields have missing values at the end when loading the pipe delimited we saw that since there are many missing values at the end, so while conversion, it's not adding pipes for them and thus number of obs in those rows becoming less than what it should be.

 

Is there a way to overcome these?

 

1 ACCEPTED SOLUTION

Accepted Solutions
Wilbram-SAS
SAS Employee

The provided data step is just for testing purposes to show that proc export in combination with a filename statement containing LRECL can actually deal with really wide data. It uses an array and do loops in a data set to generate fake date using just a few lines of SAS code. And no, you don't need it in your program.

 

The only thing you need to change in your program is to add the filename statement just above your proc export and change the value of the outfile= option. So something like:

filename widefile "/abc/file.txt" LRECL=<value large enough to hold all info>;
proc export data=lib.file
   outfile=widefile
   dbms=dlm /* Corrected a type here, you had dibms :-) */
   replace;
   delimiter='|';
run;

If you want to use the number of variables in your input table to calculate the needed LRECL so it can contain all the data values and a delimiter, then you can use data step functions to retrieve the number of variables and the logical record length of the data, calculate what you think would be the number of bytes needed, store it in a SAS macro variable and use that SAS macro variable in your filename statement. That code would look something like this: 

data _null_;
   dsid = open ("lib.file");
   lrecl_data = attrn(dsid,"LRECL"); /* See SAS doc for ATTRN function for list of table attributes that can be retrieved */
   nvars = attrn(dsid,"NVARS");
   rc = close(dsid);

   lrecl_filename = lrecl_data + nvars*1 ; /* adding a byte to hold a single delimiter between each each variable */ 

   call symput("LRECL_FILENAME_MACROVAR",put(lrecl_filename,best.));  /* Create SAS Macro variable and use put function to convert num to char and avoid notes in log */
   put _all_; /* debug statement, is optional */
run;

filename widefile "/abc/file.txt" lrecl=&LRECL_FILENAME_MACROVAR; 

proc export data=lib.file
   outfile=widefile
   dbms=dlm /* Corrected a type here, you had dibms :-) */
   replace;
   delimiter='|';
run;

View solution in original post

7 REPLIES 7
Shmuel
Garnet | Level 18

Please post the code you have run;

psrajput
Obsidian | Level 7

It's a simple proc export.

 

proc export data=lib.file

outfile=/abc/file.txt

dibms=dlm replace;

delimiter='|';

run;

andreas_lds
Jade | Level 19

With 10k vars you need 10k chars to separate variables, leaving only 2 chars per variable for the values, because afaik the text-buffer can only handle up to 32k chars. Having that many variable could be an indicator, that the data should have been re-designed before trying anything else.

 

 

Kurt_Bremser
Super User

You cannot use PROC EXPORT for this. It tries to handle the header with a single data step variable, which can't be more than 32k characters.

So you need to create your own data step, and will need to use call execute in a data step reading SASHELP.VCOLUMN.

 

But why do you have 10k variables at all? To me that suggests a massive design failure.

Wilbram-SAS
SAS Employee

Have you tried using a filename statement and specifying LRECL=?

 

Here is some sample code to play with. Does this help?

Btw. this code runs fine using SAS 9.4M6 (64-bit) on Windows 10, and results were inspected using Notepad++.

 

data widetable;
    array vv (10000) $ 8;
    do obs=1 to 10;
        do varnum=1 to 10000;
             vv[varnum]=put(1e6*obs+varnum,z8.);
        end;
        output;
     end;
     drop varnum obs;
run;
/* Truncation appears in output file as lrecl defaults to 32K, explicitly specify value for lrecl */
filename widefile "c:\temp\widetable.dlm" lrecl=910000;
proc export data=widetable outfile=widefile 
    dbms=DLM
    replace;
    delimiter="|";
    putnames=YES;
run;

Here is what the resulting SAS log says:

NOTE: 11 records were written to the file WIDEFILE.
The minimum record length was 68893.
The maximum record length was 89999.

psrajput
Obsidian | Level 7
Thanks snlwih, This looks quite close to what I am looking for. I wanted to use lrecl to force SAS to adhere to the number of columns while exporting.

Still, I have a few questions:

What data step is doing here? Is it important?
Can't we directly use lrecl somehow? So, that we are not changing the formats or values?
Wilbram-SAS
SAS Employee

The provided data step is just for testing purposes to show that proc export in combination with a filename statement containing LRECL can actually deal with really wide data. It uses an array and do loops in a data set to generate fake date using just a few lines of SAS code. And no, you don't need it in your program.

 

The only thing you need to change in your program is to add the filename statement just above your proc export and change the value of the outfile= option. So something like:

filename widefile "/abc/file.txt" LRECL=<value large enough to hold all info>;
proc export data=lib.file
   outfile=widefile
   dbms=dlm /* Corrected a type here, you had dibms :-) */
   replace;
   delimiter='|';
run;

If you want to use the number of variables in your input table to calculate the needed LRECL so it can contain all the data values and a delimiter, then you can use data step functions to retrieve the number of variables and the logical record length of the data, calculate what you think would be the number of bytes needed, store it in a SAS macro variable and use that SAS macro variable in your filename statement. That code would look something like this: 

data _null_;
   dsid = open ("lib.file");
   lrecl_data = attrn(dsid,"LRECL"); /* See SAS doc for ATTRN function for list of table attributes that can be retrieved */
   nvars = attrn(dsid,"NVARS");
   rc = close(dsid);

   lrecl_filename = lrecl_data + nvars*1 ; /* adding a byte to hold a single delimiter between each each variable */ 

   call symput("LRECL_FILENAME_MACROVAR",put(lrecl_filename,best.));  /* Create SAS Macro variable and use put function to convert num to char and avoid notes in log */
   put _all_; /* debug statement, is optional */
run;

filename widefile "/abc/file.txt" lrecl=&LRECL_FILENAME_MACROVAR; 

proc export data=lib.file
   outfile=widefile
   dbms=dlm /* Corrected a type here, you had dibms :-) */
   replace;
   delimiter='|';
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2252 views
  • 2 likes
  • 5 in conversation