
Is your SAS library a disk hog? Here's how to put it on a diet.

Started 03-02-2022
Modified 03-02-2022

I originally wrote a macro, %SQUEEZE (http://support.sas.com/kb/24/804.html), to compute the optimal length of every variable in a SAS dataset and thereby reduce the disk space that the dataset requires. No information is lost in the process: numerical accuracy is not reduced, and long character variables that contain mostly trailing blanks are redefined with a length just sufficient to hold the longest value once its trailing blanks are removed.
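
To make the idea of an "optimal length" concrete, here is a minimal sketch in Python rather than SAS. It is not the code of %SQUEEZE; the function names are hypothetical, and the numeric part assumes IEEE 754 doubles, where a shortened numeric variable keeps only the leading bytes of the 8-byte value. A length is accepted only if every value survives the shortening unchanged.

```python
import struct

def minimal_char_length(values):
    """Shortest length that holds every value once trailing blanks are dropped."""
    longest = max((len(v.rstrip(" ")) for v in values), default=0)
    return max(longest, 1)  # a SAS character variable has a length of at least 1

def fits_in_bytes(value, nbytes):
    """True if the double is unchanged when only its first nbytes bytes are kept."""
    raw = struct.pack(">d", float(value))             # big-endian 8-byte double
    truncated = raw[:nbytes] + b"\x00" * (8 - nbytes)  # drop the low-order bytes
    return struct.unpack(">d", truncated)[0] == float(value)

def minimal_numeric_length(values, min_len=3, max_len=8):
    """Smallest byte length (assumed range 3 to 8) that stores every value exactly."""
    for nbytes in range(min_len, max_len + 1):
        if all(fits_in_bytes(v, nbytes) for v in values):
            return nbytes
    return max_len

cities = ["Cary".ljust(200), "San Francisco".ljust(200)]
print(minimal_char_length(cities))              # 13
print(minimal_numeric_length([1, 42, 8192]))    # 3
print(minimal_numeric_length([8193]))           # 4
print(minimal_numeric_length([0.1]))            # 8
```

Fractions such as 0.1 have no short exact binary representation, so they keep the full 8 bytes; that is what "numerical accuracy is not reduced" amounts to in this sketch.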

 

Then I created another macro, %SQZ_LIBRARY, that invokes the %SQUEEZE macro on all of the SAS datasets contained in a SAS data library. It is available at https://www.lexjansen.com/wuss/2011/posters/Papers_Bettinger_R_74821.pdf.

 

Significant reductions in disk space may be achieved by applying the %SQZ_LIBRARY macro to SAS data libraries as a data management tool.

 

I have attached the PDF document describing the %SQUEEZE and %SQZ_LIBRARY macros. The SQZ_LIBRARY.sas file contains the %SQUEEZE and %SQZ_LIBRARY programs.

Comments

I assume the optimal length is computed from the existing data. Over time, even if the dataset and column names are untouched, the required lengths could change...

Is there a risk when you stack datasets?

Or is there a warning, or some kind of backup?
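
The stacking concern can be illustrated with a small Python simulation (not SAS, and not part of either macro). It assumes fixed-width character storage in which the stacked table inherits the length of the first table, as happens in a DATA step where the first dataset listed in SET determines a variable's length. The function names are hypothetical.

```python
def stack_with_first_length(first, later, first_length):
    """Mimic fixed-width storage: every value is cut to the first table's length."""
    return [value[:first_length] for value in first + later]

def safe_length(*tables):
    """Length that holds the longest trimmed value across all tables."""
    return max((len(v.rstrip(" ")) for t in tables for v in t), default=1)

squeezed_base = ["Cary", "Rome"]   # squeezed earlier to length 4
new_batch = ["Singapore"]          # arrives later and needs length 9

print(stack_with_first_length(squeezed_base, new_batch, first_length=4))
# ['Cary', 'Rome', 'Sing']

print(safe_length(squeezed_base, new_batch))  # 9
```

In this sketch the later value is silently cut to "Sing", so comparing lengths across all inputs before stacking, or re-squeezing only after the data is combined, avoids the loss.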

Hi,

 

Thanks for sharing, this is a much-needed tool to improve overall performance, save unneeded storage and streamline the data model as well, all in one pass! I checked the syntax of the %SQUEEZE macro and wondered about a safety feature worth considering. When you schedule a recording on your favorite TV box (like the VHS video recorders we had last century), the assistant sometimes lets you add extra time before the start and after the end of the program, for instance 10 minutes by default, so that you do not miss the end in case of a delay. Following the same logic, perhaps it would be useful to be able to squeeze without fitting exactly to the actual data length, adding an extra margin, say n% (10% or 15%, of your choosing), on top of the calculated minimum?

 

Best Regards

Ronan
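
The margin suggested in the comment above could look like the following Python sketch. The `pad_pct` parameter is hypothetical and is not an option of %SQUEEZE; the cap of 32,767 is the maximum length of a SAS character variable. The margin only makes sense for character variables, since numeric lengths are limited to a few byte sizes.

```python
SAS_MAX_CHAR_LENGTH = 32767  # maximum length of a SAS character variable

def padded_length(min_length, pad_pct=10):
    """Add pad_pct percent headroom to a minimal length, rounding up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    padded = -(-min_length * (100 + pad_pct) // 100)
    return min(max(padded, 1), SAS_MAX_CHAR_LENGTH)

print(padded_length(13, pad_pct=10))     # 15
print(padded_length(7, pad_pct=15))      # 9
print(padded_length(10, pad_pct=10))     # 11
print(padded_length(32000, pad_pct=10))  # 32767
```

A percentage gives very little headroom to short variables (a length of 1 becomes 2 at 10%), so a fixed minimum number of extra characters might be a useful complement.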

