<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Why do Macro %squeeze and option compress and the change in bytes of file seems not align? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726186#M225648</link>
    <description></description>
    <pubDate>Mon, 15 Mar 2021 04:46:31 GMT</pubDate>
    <dc:creator>Phil_NZ</dc:creator>
    <dc:date>2021-03-15T04:46:31Z</dc:date>
    <item>
      <title>Why do Macro %squeeze and option compress and the change in bytes of file seems not align?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726186#M225648</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In an effort to reduce file size, I found a macro named &lt;A href="https://support.sas.com/kb/24/805.html" target="_self"&gt;%squeeze&lt;/A&gt; (the code is &lt;A href="https://support.sas.com/kb/24/addl/fusion24805_1_squeeze2.html" target="_self"&gt;here&lt;/A&gt;) and applied it to my dataset. The result is not what I expected.&lt;/P&gt;
&lt;P&gt;I have a compressed dataset&amp;nbsp;&lt;STRONG&gt;ex_non_trading&lt;/STRONG&gt; (I created it with options &lt;STRONG&gt;compress=yes&lt;/STRONG&gt; in another data step). I ran the &lt;STRONG&gt;%squeeze&lt;/STRONG&gt; macro as follows:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options compress=yes reuse=yes;
%squeeze(my.ex_non_trading, squozennn)
proc contents data=my.ex_non_trading;
run;
proc contents data=squozennn;
run;

proc means data=my.ex_non_trading;
title 'ex_non_trading';
run;

proc means data=squozennn;
title 'squozennn';
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The output looks like this:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="My97_0-1615782002949.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/55932i0BA7578E08B4D414/image-size/large?v=v2&amp;amp;px=999" role="button" title="My97_0-1615782002949.png" alt="My97_0-1615782002949.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="My97_1-1615782020654.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/55933i4F3D5AA9A51E36E9/image-size/large?v=v2&amp;amp;px=999" role="button" title="My97_1-1615782020654.png" alt="My97_1-1615782020654.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;We can see that the file sizes of the two datasets are not really different.&lt;/P&gt;
&lt;P&gt;Looking at the log, I saw that &lt;STRONG&gt;compress=yes&lt;/STRONG&gt; on its own already reduces the file size by around &lt;STRONG&gt;70%&lt;/STRONG&gt;:&lt;/P&gt;
&lt;PRE&gt;NOTE: There were 10978714 observations read from the data set MY.EX_NON_TRADING.
NOTE: The data set WORK.SQUOZENNN has 10978714 observations and 15 variables.
NOTE: Compressing data set WORK.SQUOZENNN decreased size by 70.05 percent. 
      Compressed is 20550 pages; un-compressed would require 68618 pages.
NOTE: DATA statement used (Total process time):
      real time           33.49 seconds
      cpu time            9.29 seconds
      

207        proc contents data=my.ex_non_trading;
208        run;

NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.05 seconds
      cpu time            0.03 seconds
      

209        proc contents data=squozennn;
210        run;
&lt;/PRE&gt;
&lt;P&gt;When I ran the %squeeze macro without options compress=yes, the output squozennn grew to 4GB, four times the size of the original &lt;STRONG&gt;ex_non_trading&lt;/STRONG&gt;, which surprised me.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="My97_0-1615783375802.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/55934i510B0372E864A9B2/image-size/large?v=v2&amp;amp;px=999" role="button" title="My97_0-1615783375802.png" alt="My97_0-1615783375802.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I also had a look at another document about the &lt;STRONG&gt;compress&lt;/STRONG&gt; option.&lt;/P&gt;
&lt;P&gt;It states that:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;Compressing a file is a process that reduces the number of bytes required to &lt;STRONG&gt;represent each observation&lt;/STRONG&gt;. In a compressed file, &lt;STRONG&gt;each observation is a variable-length record&lt;/STRONG&gt;, while in an uncompressed file, each observation is a fixed-length record&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;So, in this case, do we need the &lt;STRONG&gt;%squeeze&lt;/STRONG&gt; macro at all when &lt;STRONG&gt;options compress=yes&lt;/STRONG&gt; has already done the work? From my understanding, %squeeze finds the longest length actually needed for each variable, whereas &lt;STRONG&gt;compress=yes&lt;/STRONG&gt; shortens the stored length of each observation.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Warmest regards.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;P.S. I also found a macro named &lt;A href="https://www.lexjansen.com/nesug/nesug06/io/io18.pdf" target="_self"&gt;%squeeze1&lt;/A&gt;. Has anyone used this code, and does it work well?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 15 Mar 2021 04:46:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726186#M225648</guid>
      <dc:creator>Phil_NZ</dc:creator>
      <dc:date>2021-03-15T04:46:31Z</dc:date>
    </item>
    <item>
      <title>Re: Why do Macro %squeeze and option compress and the change in bytes of file seems not align?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726191#M225650</link>
      <description>&lt;P&gt;In my experience a lot of SAS sites have COMPRESS = YES switched on as a permanent session option because it can both reduce disk storage significantly and reduce I/O. You might also try COMPRESS = BINARY, as that can sometimes do better than YES.&lt;/P&gt;
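&lt;P&gt;As a rough sketch of how the two settings could be compared (the output names work.comp_char and work.comp_binary are made up for illustration; my.ex_non_trading is the table from the question), one option is to write the same data once with each setting through the COMPRESS= data set option, then compare the compression notes in the log and the PROC CONTENTS output:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: the output data set names are hypothetical */
data work.comp_char (compress=yes);      /* run-length compression (same as CHAR) */
  set my.ex_non_trading;
run;

data work.comp_binary (compress=binary); /* binary compression */
  set my.ex_non_trading;
run;

/* Compare the page counts and file sizes reported for each copy */
proc contents data=work.comp_char;
run;
proc contents data=work.comp_binary;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Which setting wins depends on the data, so it is worth checking on your own tables.&lt;/P&gt;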
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I never bother with %SQUEEZE as it requires additional processing with unpredictable results.&lt;/P&gt;</description>
      <pubDate>Mon, 15 Mar 2021 05:37:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726191#M225650</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2021-03-15T05:37:39Z</dc:date>
    </item>
    <item>
      <title>Re: Why do Macro %squeeze and option compress and the change in bytes of file seems not align?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726214#M225656</link>
      <description>&lt;P&gt;%SQUEEZE reduces the defined length of variables, e.g. the numeric length of a date to 4.&lt;/P&gt;
&lt;P&gt;COMPRESS reduces the used length by compressing sequences of repeated characters (mainly the blanks).&lt;/P&gt;
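&lt;P&gt;A minimal sketch of the difference, using made-up data (the data set names work.full, work.shorter and work.compressed and the variables trade_date and name are assumptions for illustration, not part of %SQUEEZE itself):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Baseline: numerics take 8 bytes, name is padded to 200 bytes */
data work.full;
  length name $200;
  format trade_date date9.;
  do i = 1 to 100000;
    trade_date = '15MAR2021'd + mod(i, 30);
    name = 'ABC';                /* 3 characters plus 197 trailing blanks */
    output;
  end;
  drop i;
run;

/* What %SQUEEZE aims for: smaller DEFINED lengths.               */
/* Shortening a character variable ahead of SET makes SAS warn    */
/* about multiple lengths; the values here still fit in 3 bytes.  */
data work.shorter;
  length trade_date 4 name $3;
  set work.full;
run;

/* COMPRESS: defined lengths stay the same, repeated blanks are compressed */
data work.compressed (compress=yes);
  set work.full;
run;

proc contents data=work.shorter;
run;
proc contents data=work.compressed;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In the PROC CONTENTS output, work.shorter should list lengths 4 and 3, while work.compressed should still list 8 and 200 but be flagged as compressed.&lt;/P&gt;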
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Squeezed datasets may cause problems later if you have to combine datasets whose defined lengths differ because of their content. COMPRESS on its own never poses such a problem; there are some datasets where compressing actually increases the file size, but not by a large margin.&lt;/P&gt;</description>
      <pubDate>Mon, 15 Mar 2021 08:12:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726214#M225656</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2021-03-15T08:12:05Z</dc:date>
    </item>
    <item>
      <title>Re: Why do Macro %squeeze and option compress and the change in bytes of file seems not align?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726217#M225658</link>
      <description>&lt;P&gt;You are much better off storing your data using the SPDE engine with binary compression than any other method. If you do that, there is also no need to end up with unpredictable variable lengths (which will give you headaches when merging).&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 15 Mar 2021 08:21:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-do-Macro-squeeze-and-option-compress-and-the-change-in-bytes/m-p/726217#M225658</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-03-15T08:21:53Z</dc:date>
    </item>
  </channel>
</rss>

