Size limit to compress SAS datasets

Contributor
Posts: 25

What is the size limit for compressing SAS datasets?
If we want to compress efficiently, which dataset sizes are good candidates?
I mean, I think we do not need to compress all small datasets.
Esteemed Advisor
Posts: 5,202

Re: Size limit to compress SAS datasets

There is no special size limit for compressed tables beyond that of ordinary SAS tables. Of course, the larger the original data is, the more you can gain from compressing it. Be aware that you pay with extra CPU cycles when compressing/uncompressing data (which is done every time you read the table).
Another thing you will need to consider is your table structure. If you have long records, especially with long text fields, chances are high that compression will be effective. After compressing a table, the SAS log reports the level of compression. If disk space is not critical, I would say you need approximately 50% compression to see an overall performance gain.
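
As a rough illustration of the 50% rule of thumb above, here is a minimal Python sketch (not SAS code). It assumes you have read the compressed and uncompressed page counts from the SAS log; the function names and the default threshold are hypothetical and only encode the rule of thumb from this post.

```python
def percent_reduction(uncompressed_pages: int, compressed_pages: int) -> float:
    """Percent by which compression shrank the table (negative if it grew)."""
    if uncompressed_pages <= 0:
        raise ValueError("uncompressed_pages must be positive")
    return 100.0 * (uncompressed_pages - compressed_pages) / uncompressed_pages


def worth_compressing(uncompressed_pages: int, compressed_pages: int,
                      threshold_pct: float = 50.0) -> bool:
    """Apply the rule of thumb: keep compression only above the threshold.

    threshold_pct defaults to the approximate 50% mentioned in this post;
    it is an assumption, not a documented SAS setting.
    """
    return percent_reduction(uncompressed_pages, compressed_pages) >= threshold_pct


if __name__ == "__main__":
    # Made-up page counts, for illustration only.
    print(percent_reduction(8, 5))
    print(worth_compressing(8, 5))
    print(worth_compressing(10, 4))
```

If disk space is the main concern rather than run time, a lower threshold would be reasonable.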

http://support.sas.com/documentation/cdl/en/lrcon/59522/HTML/default/a001002773.htm

/Linus
Data never sleeps
Valued Guide
Posts: 2,111

Re: Size limit to compress SAS datasets

If you have it available, OS compression is both more efficient and more transparent. We typically get 80% compression on SAS datasets using either Windows compression or Solaris 10 compression.
Esteemed Advisor
Posts: 5,202

Re: Size limit to compress SAS datasets

If you are going to use OS/third-party compression, make sure that the compression tool works transparently with the file system, that is, for an OS user (a SAS session, for instance) the SAS table files look the same (name, extension).

Doc, have you noticed any performance differences (CPU cycles, response times) between SAS compressed tables and tables compressed by the OS? What compression tool do you use in Solaris?

/Linus
Data never sleeps
Contributor
Posts: 25

Re: Size limit to compress SAS datasets

Thanks guys :-)
Occasional Contributor
Posts: 14

Re: Size limit to compress SAS datasets

In my experience, if you create a huge SAS table, say 1 GB, on your desktop, creating it with the COMPRESS=YES option is much faster than creating it uncompressed.

I guess it is because my CPU is much faster than my hard disk's I/O speed; unfortunately, hard disk speed is the bottleneck of today's PCs.

Message was edited by: armor
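
The CPU-versus-disk guess above can be sketched with a simple back-of-the-envelope model in Python (not SAS code). Everything here is an assumption for illustration: the function name, the throughput figures, the compression ratio, and the simplification that I/O time and CPU time do not overlap.

```python
from typing import Optional


def transfer_seconds(size_mb: float, disk_mb_per_s: float,
                     ratio: float = 1.0,
                     cpu_mb_per_s: Optional[float] = None) -> float:
    """Estimated seconds to write or read a table.

    size_mb       -- uncompressed table size in MB
    disk_mb_per_s -- sustained disk throughput in MB/s
    ratio         -- compressed size / uncompressed size (1.0 = no compression)
    cpu_mb_per_s  -- rate at which the CPU compresses or uncompresses data,
                     in MB of uncompressed data per second (None = no
                     compression work at all)
    """
    io_time = size_mb * ratio / disk_mb_per_s
    cpu_time = 0.0 if cpu_mb_per_s is None else size_mb / cpu_mb_per_s
    return io_time + cpu_time


if __name__ == "__main__":
    size = 1024.0  # roughly the 1 GB table from the post

    # Slow disk (assumed 50 MB/s): compression wins because less data hits the disk.
    print(transfer_seconds(size, 50.0))
    print(transfer_seconds(size, 50.0, ratio=0.3, cpu_mb_per_s=400.0))

    # Fast disk (assumed 2000 MB/s): the CPU cost now dominates and compression loses.
    print(transfer_seconds(size, 2000.0))
    print(transfer_seconds(size, 2000.0, ratio=0.3, cpu_mb_per_s=400.0))
```

With these made-up numbers the compressed table is faster on the slow disk and slower on the fast one, which matches the idea that the answer depends on which resource is the bottleneck.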
N/A
Posts: 0

Re: Size limit to compress SAS datasets

SAS compression suits tables with many columns, including wide character variables that add up to a lot of empty space, for example addresses and free-form answer fields.
It is much less suitable for "narrow" tables, because compression adds an overhead to each observation of perhaps 80 bytes (not sure of the exact number).
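
To make the wide-versus-narrow point concrete, here is a toy Python model (not SAS code and not the actual SAS algorithm). It assumes fixed-width character fields in a single-byte encoding, pretends that compression simply drops trailing blanks, and adds a fixed per-observation overhead. The 80-byte default is only the guess from this post; check the SAS documentation for the real figure.

```python
from typing import Sequence


def uncompressed_len(widths: Sequence[int]) -> int:
    """Bytes one observation occupies with fixed-width character fields."""
    return sum(widths)


def compressed_len(values: Sequence[str], widths: Sequence[int],
                   overhead_bytes: int = 80) -> int:
    """Toy estimate of one compressed observation.

    Each value is truncated to its field width and its trailing blanks are
    dropped; overhead_bytes is the assumed fixed cost per observation.
    """
    if len(values) != len(widths):
        raise ValueError("values and widths must have the same length")
    payload = sum(len(value[:width].rstrip()) for value, width in zip(values, widths))
    return overhead_bytes + payload


if __name__ == "__main__":
    # Wide record: name, address, free-form comment (made-up widths and values).
    wide_widths = [40, 200, 500]
    wide_values = ["Ann", "1 Main St", ""]
    print(uncompressed_len(wide_widths), compressed_len(wide_values, wide_widths))

    # Narrow record: two short codes. The overhead outweighs any saving.
    narrow_widths = [8, 8]
    narrow_values = ["A", "B"]
    print(uncompressed_len(narrow_widths), compressed_len(narrow_values, narrow_widths))
```

In this toy model the wide record shrinks a lot, while the narrow record actually grows, because the fixed overhead is larger than the whole uncompressed observation.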

PeterC
Contributor
Posts: 47

Re: Size limit to compress SAS datasets

SAS strikes a balance between ease of coding and storage, but it is time for a revolution here; otherwise SAS will need more than 5 times the storage space of other databases.

I want to know when the VARCHAR type will be supported in SAS data sets, instead of only the CHAR type?
Super Contributor
Posts: 3,174

Re: Size limit to compress SAS datasets

Reply to qkaiwei: I suggest you start another thread if you insist on posing your question about VARCHAR. However, I too would respond by saying that the SAS COMPRESS feature performs adequately, albeit at the SAS member level, not the variable/column level.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 47

Re: Size limit to compress SAS datasets

Thank you for your suggestion. But I think the ultimate reason we talk about the COMPRESS option is that we want to compress datasets and reduce storage space. So if VARCHAR were supported in SAS datasets, the COMPRESS option would perhaps be meaningless.
Super Contributor
Posts: 3,174

Re: Size limit to compress SAS datasets

Meaningful response withheld until a new thread is opened.
SAS Employee
Posts: 1

Re: Size limit to compress SAS datasets

In most cases the compress option has given me benefits.
It has helped not only by reducing storage by 70 to 90% but also by improving performance when retrieving from huge datasets.
It all depends on what the bottleneck is: I/O or CPU.

On the compress option vs. VARCHAR:
A data set created with, say, 100 records occupies 20 to 30% less space than the same data in an RDBMS with the VARCHAR option. So even if a VARCHAR option were made available, the compress option might still be worthwhile.