xlsb export problem with big number of rows and columns

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 12
Accepted Solution

xlsb export problem with big number of rows and columns

DATA work.export;
  DO i=1 to 10000;
    DO j=1 to 50;
      a=i;
      b=j;
      c=i/j;
      d=j/i;
      e=c*d;
      f=e/c;
      g="Hi " || a || b;
      OUTPUT;
    END;
  END;
RUN;

PROC EXPORT  DATA=work.export DBMS=excelcs
  OUTFILE="g:\kcbiar\stp\test.xlsb" REPLACE LABEL;
SHEET="Detail"; RUN;

 

We are using proc export to create a excel file and send it within SAS by e-mail. This works great with xlsx output, but the problem with that format is that the size of the file is very big with a lot of rows an columns (say 300.000 and 20 columns is 81MB even after using ods to zip it is still 52MB). Such a big file we cannot send by e-mail and we don't want to save the data somewhere for the user. We are sending it by e-mail because the query can take up to 2 hours to complete.

 

To make the excel sheet smaller we could use the xlsb format (sample code included).  That works perfect except when we increase the number of row and columns the excel sheet is empty. In the example i used 500.000 rows and 9 columns. The file size is 9kb and the excel sheet is empty and the name of the sheet is not "Detail" but "A266FF2A662E84b639DA".  I thought maybe the old  limitation of around 65000 rows in excel was the problem, so i reduced the number of rows to 35000. Still the same result.  If I reduce the number of rows to 10000 than it works great or if I reduce the number of columns to 2 than I can use 65000 rows.

 

It looks like the combination of rows and columns limits the output.  How is this possible and is there something I can do to fix that.

 

We have SAS 9.4 and offcourse the pcfile server also.


Accepted Solutions
Solution
‎04-12-2016 07:44 AM
Occasional Contributor
Posts: 12

Re: xlsb export problem with big number of rows and columns

a) first of all that was a test the results will never be as big. More columns less rows but still a problem. I tested it also with 360.000 rows and 70 columns and got it to 15mb in size with xlsb. Maybe you learned something.

 

b) stick to data transfer you may. What I was saying before you can call everything data transfer. When you have a end user stick by your plan offering them csv files and for every sheet a different one and tell them to create there own excel sheet (customer is not important).

 

But thx for not helping. Did some more digging and found the real ploblem.

 

http://support.sas.com/kb/50/658.html.

 

 

 

View solution in original post


All Replies
Esteemed Advisor
Posts: 6,634

Re: xlsb export problem with big number of rows and columns

Your problems come from choosing a file format that is NOT suited for data transfer.

DATA work.export;
  DO i=1 to 10000;
    DO j=1 to 50;
      a=i;
      b=j;
      c=i/j;
      d=j/i;
      e=c*d;
      f=e/c;
      g="Hi " || a || b;
      OUTPUT;
    END;
  END;
RUN;

data _null_;
set work.export;
file '$HOME/sascommunity/big_export.csv' dlm=',';
put
  a
  b
  c
  d
  e
  f
  g
;
run;

produces a file that is just 37 MB, after gzip it's just 7.5 MB.

 

Don't use a spreadsheet format for data transfer; two completely different things.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 12

Re: xlsb export problem with big number of rows and columns

First of all it is not data transfer and csv is not the solution. Beside that every excel sheet is data transfer. The people who receive this data don't want a csv file.

I just made an example and the end product is much more sheets than one in the excel file.
Esteemed Advisor
Posts: 6,634

Re: xlsb export problem with big number of rows and columns

a) your data won't get any smaller. A sample CSV I made with 300000 rows and 21 columns of random numerical and character data was 37 MB when gzipped. This is simply too big to send per email, as many mail servers will simply refuse to accept such attachments. For very good reasons.

 

b) No matter what you call it, you try to get data from point A (SAS) to point B (a presentation program on a desktop). This is data transfer, period.

And your (unwisely) chosen medium (Excel file via email) causes you grief b/c of not working right. Throwing a "but I don't like CSV" tantrum like a kid that didn't get her candy at the supermarket won't change a thing. Tell that to your people. The sun rises in the East and settles in the West, if you like it or not.

 

Because of all this, you either need to change your requirements (smaller data) or your tools (different file format and different path for the data).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Solution
‎04-12-2016 07:44 AM
Occasional Contributor
Posts: 12

Re: xlsb export problem with big number of rows and columns

a) first of all that was a test the results will never be as big. More columns less rows but still a problem. I tested it also with 360.000 rows and 70 columns and got it to 15mb in size with xlsb. Maybe you learned something.

 

b) stick to data transfer you may. What I was saying before you can call everything data transfer. When you have a end user stick by your plan offering them csv files and for every sheet a different one and tell them to create there own excel sheet (customer is not important).

 

But thx for not helping. Did some more digging and found the real ploblem.

 

http://support.sas.com/kb/50/658.html.

 

 

 

Esteemed Advisor
Esteemed Advisor
Posts: 7,189

Re: xlsb export problem with big number of rows and columns

I would agree with @KurtBremser, Excel based file formats are not ideal for any type of data transfer.  There are many reasons for this including data integrity problems, in your case your trying to create binary files which will not be cross platform, requires proprietary software etc.

 

CSV is a minimalistic data transfer medium (i.e. it only carries the data), is cross platform, and requires no proprietary program to open, however Excel can natively open CSV files.  So that would be a good suggestion.

 

XML is another one to consider - it has the cross platform, plain text format, but also contains a lot of descriptive information, and can have style information and such like.  XLSX format for instance is a zipped set of XML files (change the .xlsx to .zip and then you will see this).  The downside is that it is quite verbose - your example above where the XLSX file is getting quite big, this is because XML is many times the size of plain format such as CSV.

 

JSon is another file type, more of an object orientated file format.  And there are various other formats out there.

 

I suppose one of the big questions I would raise is "why".  Why are you sending data via email?  Email is not a data transfer medium, it has size limitations, can be a security risk etc.  Data transfer in an ideal world would oocur first by getting the documentation organised and signed off, this would give data structure and transfer arrangements - one of which wiuld be method of transfer.  The data would then be sent via a secure portal, be it SFTP, secure web portal, or something similar. 

Occasional Contributor
Posts: 12

Re: xlsb export problem with big number of rows and columns

[ Edited ]

It is not data transfer. I know json I know csv I know xml and the email is send to the person who starts the stored proces (which got autorisation). For data transfer I use DI-studio.

 

I only descripe a small part of the proces in which something is going wrong.

 

Just look at the problem at hand.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 844 views
  • 1 like
  • 3 in conversation