There are no special limits for compressed tables beyond those of ordinary SAS tables. Of course, the larger the original data is, the more you can gain from compressing it. Be aware that you pay with extra CPU cycles for compressing/uncompressing the data, and this cost is incurred every time you read the table.
Another thing you will want to consider is your table structure. If you have long records, especially with long text fields, chances are high that compression will be effective. After compressing a table, the SAS log will tell you the level of compression achieved. If disk space is not critical, I would say you need approximately 50% compression to see an overall performance gain.
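To make that concrete, here is a minimal sketch of turning compression on, either for a single data set or session-wide (the library and data set names are just placeholders):

```sas
/* Enable compression for one data set via a data set option */
data work.survey_c (compress=yes);
    set work.survey;
run;

/* Or enable it for every data set created from here on */
options compress=yes;
```

After the DATA step runs, look for the NOTE in the log stating by what percentage compression decreased (or increased!) the data set size; that figure is the basis for the rough 50% rule of thumb above.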
If you are going to use OS/third-party compression, make sure that the compression tool works transparently with the file system, that is, to an OS user (a SAS session, for instance) the SAS table files look the same (name, extension).
Doc, have you noticed any performance differences (CPU cycles, response times) between SAS-compressed tables and tables compressed by the OS? What compression tool do you use on Solaris?
SAS compression suits tables with many columns, including wide character variables that add up to a lot of empty space: for example, addresses and free-form answer fields.
It is much less suitable for "narrow" tables, because compression adds a per-observation overhead of perhaps 80 bytes (I'm not sure of the exact number).
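As a sketch, SAS lets you pick between two compression algorithms per data set, and which one wins depends on the column mix described above (the data set names here are illustrative):

```sas
/* COMPRESS=CHAR (RLE) tends to pay off on wide character columns
   such as addresses and free-form text */
data work.addresses_c (compress=char);
    set work.addresses;
run;

/* COMPRESS=BINARY (RDC) is often the better choice when records
   contain many numeric variables */
data work.measures_c (compress=binary);
    set work.measures;
run;
```

For a narrow table, either choice can actually grow the file because of the per-observation overhead; the log NOTE will tell you if that happened.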
Reply to qkaiwei: I suggest you start another thread if you want to pursue your question about VARCHAR. However, I too would respond that the SAS COMPRESS feature performs adequately, albeit at the SAS member (data set) level, not the variable/column level.
Thank you for your suggestion, but I think the only reason we talk about the COMPRESS option is that we want to compress data sets and reduce storage space. So if VARCHAR were supported by SAS data sets, the COMPRESS option would perhaps be meaningless.
In most cases the COMPRESS option has given me benefits.
It has helped not only in reducing storage by 70 to 90%, but also in gaining performance when retrieving from huge data sets.
It all depends on what the bottleneck is: IO or CPU.
On the COMPRESS option vs. VARCHAR:
In my experience, data sets created with, say, 100 records occupy 20 to 30% less space than the equivalent RDBMS tables using the VARCHAR option. So even if a VARCHAR option were made available, the COMPRESS option might still be worthwhile.
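If you want to check the effect on your own data, one simple way (the data set name is a placeholder) is to look at the data set attributes:

```sas
/* PROC CONTENTS reports whether the data set is compressed
   and how many data set pages it occupies */
proc contents data=work.survey_c;
run;
```

Comparing the page counts of the compressed and uncompressed versions gives you the same information as the log NOTE, after the fact.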