dataset size

Frequent Contributor
Posts: 81

Hi All,

How can I reduce the size of a dataset so that it can be transferred or loaded easily?

Regards

Anand


Accepted Solutions
09-12-2013 07:17 PM
Super User
Posts: 3,108

Re: dataset size

I suggest you check out the COMPRESS= data set option and the COMPRESS= system option. We have COMPRESS=BINARY set by default in our SAS environment because it reduces our dataset sizes by as much as 80 percent. The compression ratio varies a lot, but compression pays off most on large datasets with many long character columns. Not only does it save space, it also improves processing performance because of the reduced I/O. The trade-off is the increased CPU overhead of compression and decompression, which is well worth it for large datasets.
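
As a rough sketch of how the option can be applied, assuming a made-up dataset WORK.SAMPLE with one long character column (the dataset and variable names here are illustrative only):

```sas
/* Hypothetical sample data: a long character column that is mostly blank */
data work.sample;
    length comment $200;
    do id = 1 to 100000;
        comment = 'short text';
        amount  = id * 1.5;
        output;
    end;
run;

/* Data set option: compress only this output dataset */
data work.sample_bin (compress=binary);
    set work.sample;
run;

/* System option: compress every dataset created later in this session */
options compress=binary;

/* PROC CONTENTS shows whether the dataset is stored compressed */
proc contents data=work.sample_bin;
run;
```

The SAS log notes written by the second DATA step indicate how much the size changed, so it is easy to check whether compression is paying off for a given dataset.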



All Replies
Super Contributor
Posts: 644

Re: dataset size

The COMPRESS= options (which one you choose will depend on whether your data is primarily numeric or character) will usually reduce storage and processing time but will not greatly affect transfers and loads. Loads of data into SAS depend primarily on the rate at which data can be read in, so the characteristics of the external data source will likely dominate. Transfers between SAS installations may be improved unless you use a transport format or CPORT, in which case compression may have little impact. Exports of SAS data to other formats or environments are also unlikely to be affected. There does not seem to be a standard for passing 'zipped' or compressed data between applications: in my experience the data has to be unzipped before transfer.
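
A minimal sketch of trying both COMPRESS= settings and then building a transport file with PROC CPORT. The dataset names, the fileref TRANFILE and the file name sample.cpt are assumptions made up for illustration:

```sas
/* Hypothetical sample data */
data work.sample;
    length comment $200;
    do id = 1 to 100000;
        comment = 'short text';
        output;
    end;
run;

/* Try both settings and compare the size notes in the log */
data work.sample_char (compress=char);
    set work.sample;
run;

data work.sample_bin (compress=binary);
    set work.sample;
run;

/* Transport file for moving the data to another SAS installation */
filename tranfile 'sample.cpt';   /* hypothetical path */

proc cport data=work.sample_char file=tranfile;
run;

/* On the receiving installation: rebuild the dataset from the transport file */
proc cimport infile=tranfile data=work.sample_received;
run;
```

Comparing the size of the transport file with the size of the compressed dataset is a quick way to see how much of the saving survives the transfer in your own case.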

One option to explore in the case of exporting data is replacing repeating character data (e.g. product names) with short codes (by creating a character format). At the receiving end the application will need to use a lookup table to expand the codes back to the original values. Only worth the bother if your data has a significant volume of such data, and the codes are relatively stable.
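
The idea could look something like the sketch below. The format names $PRODCODE and $PRODNAME, the product names and the codes are all invented for the example, and the second half assumes the receiving application is also SAS:

```sas
/* Hypothetical lookup: long product names to 2-character codes */
proc format;
    value $prodcode
        'Premium Widget Deluxe Edition' = 'W1'
        'Standard Gadget Starter Pack'  = 'G1'
        other                           = '??';
    /* Reverse lookup for the receiving end */
    value $prodname
        'W1'  = 'Premium Widget Deluxe Edition'
        'G1'  = 'Standard Gadget Starter Pack'
        other = 'Unknown product';
run;

/* Hypothetical source data */
data work.sales;
    length product_name $40;
    infile datalines dsd;
    input product_name :$40. qty;
    datalines;
Premium Widget Deluxe Edition,3
Standard Gadget Starter Pack,5
Premium Widget Deluxe Edition,1
;
run;

/* Export version: keep only the short code */
data work.sales_coded (drop=product_name);
    set work.sales;
    length product_code $2;
    product_code = put(product_name, $prodcode.);
run;

/* Receiving end: expand the codes back to the full names */
data work.sales_restored (drop=product_code);
    set work.sales_coded;
    length product_name $40;
    product_name = put(product_code, $prodname.);
run;
```

Here each row carries 2 bytes instead of 40 for the product, at the cost of keeping the two formats in step on both sides.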

SAS supports IBM System/370 packed decimal formats (the S370F family, e.g. S370FPDw.d), which encode 2 digits in a single byte. These might be useful for fixed (vs floating) point numeric data such as account balances, provided that the application at the receiving end has the ability to restore the original values.
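
A small sketch of writing and re-reading packed decimal values, assuming the S370FPDw.d format and informat. The dataset, the fileref BALOUT and the file name balances.dat are hypothetical:

```sas
/* Hypothetical account balances */
data work.accounts;
    input account_id $ balance;
    datalines;
A0000001 1234.56
A0000002 -987.65
A0000003 250000.00
;
run;

/* Hypothetical output file: fixed-length binary records of 8 + 5 bytes */
filename balout 'balances.dat' recfm=f lrecl=13;

data _null_;
    set work.accounts;
    file balout;
    /* S370FPD5.2: 5 bytes, 2 implied decimals, up to 9 digits plus sign */
    put account_id $char8. balance s370fpd5.2;
run;

/* Receiving end (if SAS): the matching informat restores the values */
data work.accounts_back;
    infile balout;
    input account_id $char8. balance s370fpd5.2;
run;
```

Each balance takes 5 bytes here, where the same value written as text such as 250000.00 would take 9.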

Richard

Frequent Contributor
Posts: 81

Re: dataset size

Thanks SASKiwi and RichardinOz for your valuable answers and explanations. They will help me a lot.

-Anand
