Desktop productivity for business analysts and programmers

Export large SAS dataset

Reply
Frequent Contributor
Posts: 94

Export large SAS dataset

Hello, I am currently using SAS EG6.2 to manipulate/analyze the data of multiple input files (from excel).  The resulting dataset is about 1.2 millon rows.  I need to export that Excel 2010 for others to use for their purposes, such as creating pivot tables.  I use the EG function to export or export in the step, but it takes over a hour.  And sometimes it results in an error, which I think may be due to the fact that it's over 1 millon rows. 1. What is the most efficient way to export such a large dataset? 2.  I currently export as an excel (.xlsx) file.  Is it better to export as .txt, .csv or some other extension?  Are there any differences in the ability to use the file for pivot tables when it's .txt, .csv or .xlsx? Thank you!

Super User
Posts: 7,444

Re: Export large SAS dataset

Using Excel to do transformations of a 1.2 million record dataset when you have SAS looks quite foolish to me. The shortcomings of Excel as a BI tool have been sufficiently discussed around here Smiley Wink

Excel 2010 has a limit of 1M rows, 1048576, so you fail there. Regardless of file format.

You can reduce the time for the transfer by reducing your row size (trim character vars, drop unused columns).

But keep in mind that Excel is not a tool for tables, but for _spreadsheets_. Those sheets should rarely expand beyond your screen size, because they tend to be uncontrollable.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 11,134

Re: Export large SAS dataset

If I had that large of a set and some one thinks they want to do pivot tables I would ask which tables, make them in SAS and then export the results.

Super User
Super User
Posts: 7,720

Re: Export large SAS dataset

I concur with the above.  A pivot table is really only a glorified flitering/summarizing tool.  Generate the required summaries and then export the results to Excel.

Respected Advisor
Posts: 3,156

Re: Export large SAS dataset

If your site has SAS OLAP Cube and SAS MS Add-in license, then it becomes straightforward in term of making/exporting  Excel pivot tables, if your users still want to use Pivot tables instead of OLAP cubes.

Frequent Contributor
Posts: 94

Re: Export large SAS dataset

I agree with all the comments above - it's preaching to the choir.  It's an big uphill battle for me to change the mindset of a large corporation to stop using excel for reporting.  As reeza stated in one of her posting...excel is a spreadsheet calculator. Majority of the people do not know SAS and I am the dept sole user.  They run a lot of adhoc reports and I cannot keep up with those and having to constantly change my codes to fit the needs.  So people said to give them the raw data....I am doing my best with the situation. I am generated required summaries but then this person wants this, this other person want that....and back to where I started from.  I posted the original post to see if anyone have an insight into exporting a large dataset - that's all.

Trusted Advisor
Posts: 2,114

Re: Export large SAS dataset

Jen,

If your audience is looking for %, this data are big enough that a sample might be sufficient.  You would need to do some education as to the limitations of a sample, but it is likely sufficient to support decisions.

Respected Advisor
Posts: 4,135

Re: Export large SAS dataset

There are still a lot of these Excel shops around and they all have the same problems. The overall costs (work) are quite significant but don't show as a single line item in a balance sheet, data quality is questionable and different departments sometimes produce different numbers and then spend a lot of time investigating and explaining the differences,....

On the other hand your users need to do some data analysis and "play" with the data so they can't give you clear specs for reports. And as their requirements and questions change frequently they also need constantly different data extracts from you. Excel is for them the only tool available to play with data - so that's what they are used to and use.

Using SAS tools I can think of two ways to better the situation:

- SAS Add In for MS Office (SAS AMO). This would allow you to create a bunch of stored processes which users then can call, apply some filters and then get the data directly into Excel by themselves or also use some of SAS' reporting and analytic functions directly.

- SAS Visual Analytics (SAS VA): If there is an opportunity to create a set of granular de-normalised analytical tables which could serve as basis for the majority of data requirements, then using VA would be ideal. I've seen business users with zero SAS knowledge and no programming experience learning in two days of how to retrieve and analyse data and how to build reports. I was pretty impressed by this and it was the closest I've seen so far in "self serve BI".

Super User
Posts: 19,165

Re: Export large SAS dataset

If your users are Excel based and you're the sole SAS user sometimes you have to work with that, even though it sucks.

I'd export the data to an Access DB and then have them connect to the data via DATA>External Data Sources. I think that will still work if the number of rows is over 1 million, but I'm not in an environment to test it at the moment.

Valued Guide
Posts: 2,177

Re: Export large SAS dataset

Ok

so will someone please tell me what the flexible reporting alternative to excel might be?

just someone who wants to find that alt.

Super User
Posts: 7,444

Re: Export large SAS dataset

You have flexible in SAS.

If you meant "simple enough that any monkey can produce shiny reports", you need Excel. But don't expect the results to be correct. Or rise up to the scrutiny of compliance inspectors w/regards to ISO standards, SOX, BASEL II, ....

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Respected Advisor
Posts: 4,135

Re: Export large SAS dataset

Nothing wrong about Excel as such. It's only about finding the right balance and mix for the tasks at hand and for data aggregation, analysis & reporting on 1M+ source rows Excel is eventually not the ideal tool.

I'm very much for empowering end users and believe that's exactly what SAS VA in the SAS world does when it comes to bigger data volumes which Excel can't handle anymore.

Super User
Super User
Posts: 7,720

Re: Export large SAS dataset

Teach a man to sharpen a pitchfork and he will ...

Fact of the matter is, empowering end user is all well and good until such time as something goes wrong.  E.g. someone high up asking why there are mistakes, figures inaccurately shown, global financial crisis?  In my role, as I am sure in most others, we are strongly controlled by procedure and validation.  Its hard, explaining the concept of validation to someone new, what can we validate on importing a spreadsheet was one question recently.  From my side everything, down to the last individual bit should be checked and double checked (independently in most cases) with full documentation, otherwise as a well know statement goes, it didn't happen.

VA is SAS's new Excel-like which was demonstrated at Phuse this year.   It all looks well and dandy but still falls into the category of putting information in the hands of those not following the same strict processes.  I would expect after a few months of working with it you will get an urgent question back on some documents which have been supplied to someone asking for some changes and when you look at it, you will not recognize the data, output etc. and then spend the next 6 months trying to re-engineer it only to find half a dozen errors in it that you then need to try to explain.

I personally would either:

a) dump the data to CSV and mention the limitations of Excel.

b) Request specs for outputs.  If they don't know what they want then how can they achieve it.

Respected Advisor
Posts: 4,135

Re: Export large SAS dataset

If anything then VA in it's core is a new way of OLAP (as a process not a storage concept - and with newer VA versions it's OLAP++). VA allows a lot of flexibility to the end user but unlike in Excel you can't do "anything you want".

Everything is calculated "on-the-fly" so there are no 10 steps with intermediary results and manual changes to data like possible in Excel but the same base data will be used. This also makes results repeatable.

Frequent Contributor
Posts: 94

Re: Export large SAS dataset

Thanks for the response. Re: your option B...they don't know what they want until they decide it's needed.  :-) I am in situation like many others...and I strongly dislike excel for number crunching, analysis, etc...

Ask a Question
Discussion stats
  • 17 replies
  • 1979 views
  • 1 like
  • 11 in conversation