I read that specifying COMPRESS=YES/CHAR/BINARY will save space, but that it will also cause a CPU delay as SAS processes the compressed dataset.
So compressing only the final dataset in a program seems a better idea than specifying COMPRESS=YES at the beginning of the program, since that would also compress the WORK datasets and add to the CPU load.
Depends on the OS and on the temporary/permanent disk storage available?
You have the OPTIONS COMPRESS= setting to control the behavior for all SAS members, and then you have the dataset option COMPRESS= when creating an individual SAS member.
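To illustrate the two scopes (the library and dataset names here are hypothetical), a minimal sketch:

```sas
/* System option: every dataset created after this point is compressed */
options compress=yes;

/* Dataset option: overrides the system setting for this one member only */
data mylib.wide_table (compress=char);
    set work.staging;
run;
```

The dataset option always wins over the system option for the member it is attached to.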
For Windows and *nix, my experience has been to use COMPRESS=YES as the default OPTIONS setting. For the IBM mainframe, however, use a default of OPTIONS COMPRESS=NO if you are not DASD-constrained for your WORK allocation, and then set OPTIONS COMPRESS=YES just prior to outputting your permanent SAS data library files. Alternatively, consider it on an individual-case basis: some larger, long-retention files may give you back 40-60% in space savings with COMPRESS=YES, provided you do not expect a CPU-constrained situation.
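The mainframe pattern described above might be sketched like this (library and dataset names are hypothetical):

```sas
/* Job default: leave WORK datasets uncompressed to save CPU */
options compress=no;

/* ...intermediate DATA and PROC steps against WORK here... */

/* Switch compression on just before writing the permanent library */
options compress=yes;
data perm.final_extract;
    set work.final_extract;
run;
```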
For tables with wide columns holding lots of empty space, I would write with COMPRESS=YES; otherwise, use platform compression:
- z/OS migration is inconvenient, but effective.
- Windows compression seems superior to SAS compression, but seems limited to local drives at most client sites where I have consulted.
More clearly stated, z/OS (DFSMS-managed, striped datasets) compression is not supported except for "SAS sequential-access formatted" data sets (tape or DASD allocated). So with your typical SAS z/OS (IBM mainframe environment) direct-access bound data libraries, I suggest using COMPRESS=YES for large-observation/wide-column datasets, as Peter.C mentioned in his prior post; there you stand to gain 50% or more in space savings, again presuming you are not CPU-limited. Also consider a more optimal BLKSIZE setting in SAS on z/OS, rather than the SAS system default of 6144.
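As a sketch of the BLKSIZE suggestion (the half-track value shown assumes 3390-type DASD; adjust for your own device geometry):

```sas
/* z/OS: half-track blocking for 3390 DASD instead of the 6144 default */
options blksize=27648;
```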
If using SAS on z/OS (the OS platform was not stated in the original post), I suggest the OP review the SAS Companion documentation; in fact, the performance topic for any OS platform is reasonable reading, in my view.
For all of our production SAS jobs we set COMPRESS=YES as a SAS option, and COMPRESS=BINARY on specific datasets that are approaching 1GB in size. BINARY typically compresses about 20 percent smaller than CHAR for large, wide datasets. Our Windows server processes compressed datasets faster than uncompressed ones, because the IO saved far outweighs the processor overhead of compressing/uncompressing the data.
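A sketch of that setup (the library and dataset names are hypothetical):

```sas
/* Production default: compress everything */
options compress=yes;

/* Large (near-1GB), wide dataset: BINARY (Ross Data Compression)
   typically beats the CHAR (RLE) default here */
data prod.claims_history (compress=binary);
    set work.claims_history;
run;
```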
At the moment we also have Windows compression turned on because we are very space-constrained. Even with "double" compression SAS performance is not adversely impacted.
Most Windows SAS servers I have used tend to be IO-bound more often than processor-bound. If this is the case for you, then dataset compression can significantly improve processing times as well as save disk space.