I originally wrote a macro, %SQUEEZE (http://support.sas.com/kb/24/804.html), to compute the optimal length of every variable in a SAS dataset and thereby reduce the disk space the dataset requires. No information is lost in the process: numeric precision is not reduced, and long character variables that contain mostly trailing blanks are redefined to the length of their longest nonblank value.
Then I created another macro, %SQZ_LIBRARY, that invokes the %SQUEEZE macro on all of the SAS datasets contained in a SAS data library. It is available at https://www.lexjansen.com/wuss/2011/posters/Papers_Bettinger_R_74821.pdf.
Significant reductions in disk space may be achieved by applying the %SQZ_LIBRARY macro to SAS data libraries as a data management tool.
I have attached the PDF document describing the %SQUEEZE and %SQZ_LIBRARY macros. The SQZ_LIBRARY.sas file contains the %SQUEEZE and %SQZ_LIBRARY programs.
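For readers who want to see the idea for a single character variable, here is a minimal sketch. It is not the %SQUEEZE source; the dataset WORK.DEMO, the variable NAME, and the macro variable MAXLEN are hypothetical names chosen for illustration, and the TRIMMED option assumes SAS 9.3 or later.

```sas
/* Hypothetical demo data: NAME is declared $200 but holds short values */
data work.demo;
   length name $200;
   input name $;
   datalines;
Alice
Bob
Christopher
;
run;

/* Find the longest nonblank value actually stored in NAME */
proc sql noprint;
   select max(length(name)) into :maxlen trimmed
   from work.demo;
quit;

/* Rebuild the dataset with NAME declared at that length.
   The LENGTH statement must come before SET to take effect. */
data work.demo_squeezed;
   length name $&maxlen;
   set work.demo;
run;
```

SAS will normally write a warning about multiple lengths for NAME in the last step; no value is truncated here because the new length equals the longest stored value.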
I assume the optimal length is computed from the existing data. Over time, even if the dataset and column names stay the same, the lengths the data needs could change.
Is there a risk when you stack datasets?
Is there a warning, or some kind of backup?
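To make the stacking concern concrete, here is a small sketch. The datasets WORK.OLD and WORK.NEW and the variable CITY are hypothetical. It relies on standard DATA step behavior: a variable takes its length from the first place it is defined, which for SET is the first dataset listed.

```sas
/* Hypothetical datasets: same variable, different declared lengths */
data work.old;            /* e.g. squeezed at an earlier date */
   length city $3;
   city = 'Rio';
run;

data work.new;            /* newer data with a longer value */
   length city $9;
   city = 'Barcelona';
run;

/* CITY gets its length ($3) from WORK.OLD, the first dataset listed,
   so 'Barcelona' is stored as 'Bar' */
data work.stacked;
   set work.old work.new;
run;

/* Declaring the longer length before SET avoids the truncation */
data work.stacked_safe;
   length city $9;
   set work.old work.new;
run;
```

In the first stacking step SAS usually writes a warning to the log that multiple lengths were specified for the variable, so the truncation is not completely silent, but it is easy to miss.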
Hi,
Thanks for sharing. This is a much-needed tool that improves overall performance, saves unneeded storage, and streamlines the data model, all in one pass! I checked the syntax of the %SQUEEZE macro and wondered whether a safety feature might be worth considering. When you schedule a recording on your favorite TV box (or on the VHS video recorders we had last century), the assistant sometimes lets you add extra time before the start and after the end of the program, for instance 10 minutes by default, so that you do not miss the ending if the broadcast runs late. Following the same logic, it might be useful to be able to squeeze not exactly to the actual data length but with an extra margin, say n% (10% or 15%, of your choosing), on top of the calculated minimum.
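As a rough sketch of what I mean (this is not an existing %SQUEEZE option; MAXLEN, MARGIN_PCT and PADLEN are hypothetical macro variables), the padded length could be derived from the calculated minimum like this:

```sas
/* Hypothetical inputs: the minimum length found in the data
   and the safety margin in percent */
%let maxlen     = 11;
%let margin_pct = 10;

/* Round up so the padded length is never below the requested margin */
%let padlen = %sysevalf(&maxlen * (1 + &margin_pct / 100), ceil);

%put NOTE: minimum length=&maxlen, padded length=&padlen;
```

The resulting value could then be used in a LENGTH statement in place of the exact minimum. For character variables it would also need to be capped at 32767, the maximum character length in SAS.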
Best Regards
Ronan