I receive daily data with a very large number of rows and columns and need to compress it to save space.
Is there a way to compress a SAS dataset? And can I read directly from the compressed dataset?
You can use the COMPRESS= option.
- compress= dataset option
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/ledsoptsref/n014hy7167t2asn1j7qo99qv16wa.htm
- compress= system option
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lesysoptsref/n0uhpz2l79vy0qn12x3ifcvjzsyt.htm
options compress=yes;/* this apply all datasets */
data test;
set sashelp.class(compress=yes);/* this apply this dataset only */
run;
SAS seems to complain about that:
41301
41302 options compress=yes;/* this apply all datasets */
41303
41304 data test;
41305 set sashelp.class(compress=yes);/* this apply this dataset only */
--------
70
WARNING 70-63: The option COMPRESS is not valid in this context. Option ignored.
41306 run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.TEST has 19 observations and 5 variables.
NOTE: Compressing data set WORK.TEST increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
You do not put the COMPRESS= dataset option on the dataset being read. That does not make any sense. SAS can tell from the header of the file being read whether or not it was written using compression. It is when you are writing the dataset that you need to tell SAS what compression method to use.
data test(compress=yes);
set sashelp.class;
run;
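As a minimal sketch of the point above: the option goes on the output dataset, and reading the result back needs nothing extra. The dataset names work.class_char, work.class_bin, and work.readback are made up for illustration. On a table as small as SASHELP.CLASS, compression may well increase the size, as the log above shows, so this only demonstrates the syntax.

```sas
/* COMPRESS= belongs on the dataset being written.            */
/* CHAR is an alias for YES (run-length encoding).            */
data work.class_char(compress=char);
    set sashelp.class;
run;

/* BINARY uses a different algorithm that may suit wide,      */
/* mostly numeric data better; test on your own data.         */
data work.class_bin(compress=binary);
    set sashelp.class;
run;

/* Reading needs no option: SAS detects compression from the  */
/* file header.                                               */
data work.readback;
    set work.class_bin;
run;

/* PROC CONTENTS reports the "Compressed" attribute.          */
proc contents data=work.class_bin;
run;
```

The log note "Compressing data set ... decreased size by ..." (or "increased size by ...") after each step is the quickest way to see whether compression is paying off for a given table.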
It is quite common for SAS installations to have the COMPRESS=YES or COMPRESS=BINARY system option switched on by default, so all SAS datasets stored on disk are compressed unless the option is turned off for specific datasets.
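To check what a given session is doing, something like the following should work. The dataset name work.small_lookup is a hypothetical example.

```sas
/* Print the current value of the COMPRESS system option to the log. */
proc options option=compress;
run;

/* Override the session default for a single dataset, for example a  */
/* small lookup table where compression would only add overhead.     */
data work.small_lookup(compress=no);
    set sashelp.class;
run;
```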
Please note that SAS datasets are not compressed once loaded into memory. The one exception to this is with SAS VA in-memory datasets where they can be optionally compressed. This can significantly degrade performance so is not normally recommended.
Is anything additional needed when reading the data in?
Are you getting the data as SAS datasets? Or as text files, like a CSV file? Or in some other format, like XLSX?
SAS has a COMPRESS option that can reduce the space SAS datasets take up on disk. You can use compressed datasets just like normal SAS datasets. There is a little extra CPU time to expand the data, but it is normally more than offset by the reduced input/output time. Note, however, that the level of compression is typically much lower than you would get with a file-level compression tool like ZIP or GZIP.
If the file is a simple text file then use ZIP or GZIP to compress it. You can still read either format from SAS by using the ZIP filename engine.
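A rough sketch of reading a gzipped CSV through the ZIP filename engine follows. The paths, the fileref names, and the column layout are all assumptions for illustration, and the GZIP option requires a reasonably recent release (SAS 9.4M5 or later, as far as I know).

```sas
/* Hypothetical path: point this at the real gzipped file.        */
filename gzin zip "/path/to/daily_data.csv.gz" gzip;

data work.daily;
    /* DSD handles comma-delimited values; FIRSTOBS=2 skips the   */
    /* header row, assuming the file has one.                     */
    infile gzin dsd firstobs=2 truncover;
    /* Hypothetical columns: adjust names, types, and lengths.    */
    input id :8. name :$32. amount :8.;
run;

/* For a file inside a .zip archive, name the member instead of   */
/* using the GZIP option (again, a hypothetical path and member). */
filename zipin zip "/path/to/daily_data.zip" member="daily_data.csv";
```

The second fileref can be used in an INFILE statement exactly like the first; the file is decompressed as it is read, so no uncompressed copy needs to be written to disk.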