<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Character vs Binary compression in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37983#M7607</link>
    <description>We have an example in our advanced programming/efficiencies course where this overhead could actually make size -increase-. So "it depends" or "your mileage may vary" or "you gotta benchmark it on your data" is about all anyone can say. I believe that there are several good explanations of compression and the overhead, out and about in papers, doc, etc, including these:&lt;BR /&gt;
&lt;A href="http://www2.sas.com/proceedings/sugi28/003-28.pdf" target="_blank"&gt;http://www2.sas.com/proceedings/sugi28/003-28.pdf&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://support.sas.com/resources/papers/proceedings09/065-2009.pdf" target="_blank"&gt;http://support.sas.com/resources/papers/proceedings09/065-2009.pdf&lt;/A&gt;&lt;BR /&gt;
    &lt;BR /&gt;
cynthia</description>
    <pubDate>Mon, 22 Nov 2010 19:14:11 GMT</pubDate>
    <dc:creator>Cynthia_sas</dc:creator>
    <dc:date>2010-11-22T19:14:11Z</dc:date>
    <item>
      <title>Character vs Binary compression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37981#M7605</link>
      <description>Hi&lt;BR /&gt;
&lt;BR /&gt;
I would have expected that in the example below binary compression reduces the size of the data set more than character compression.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Actually the opposite is true (SAS9.2, Win7, 64Bit):&lt;BR /&gt;
&lt;BR /&gt;
NOTE: Compressing data set WORK.HAVEB decreased size by 45.84 percent.&lt;BR /&gt;
      Compressed is 645 pages; un-compressed would require 1191 pages.&lt;BR /&gt;
&lt;BR /&gt;
NOTE: Compressing data set WORK.HAVEC decreased size by 51.97 percent.&lt;BR /&gt;
      Compressed is 572 pages; un-compressed would require 1191 pages.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
data haveB(COMPRESS=binary);&lt;BR /&gt;
  do i=1 to 1000000 by 10;&lt;BR /&gt;
    a='              x                   ';&lt;BR /&gt;
    output;&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data haveC(COMPRESS=yes);&lt;BR /&gt;
  do i=1 to 1000000 by 10;&lt;BR /&gt;
    a='              x                   ';&lt;BR /&gt;
    output;&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Can someone explain this behaviour to me?&lt;BR /&gt;
&lt;BR /&gt;
Thanks&lt;BR /&gt;
Patrick</description>
      <pubDate>Mon, 22 Nov 2010 09:16:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37981#M7605</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2010-11-22T09:16:25Z</dc:date>
    </item>
    <item>
      <title>Re: Character vs Binary compression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37982#M7606</link>
      <description>To implement compression of either type, SAS inserts information about where and how much compression was applied, along with the compressed values, into each page of data that gets written.&lt;BR /&gt;
There is also row-level overhead.&lt;BR /&gt;
Your experience will vary depending on the mix of data types and how wide or narrow the row is. I find character compression adequate for balancing CPU and I/O.&lt;BR /&gt;
ymmv&lt;BR /&gt;
peterC</description>
      <pubDate>Mon, 22 Nov 2010 17:11:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37982#M7606</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2010-11-22T17:11:19Z</dc:date>
    </item>
    <item>
      <title>Re: Character vs Binary compression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37983#M7607</link>
      <description>We have an example in our advanced programming/efficiencies course where this overhead could actually make size -increase-. So "it depends" or "your mileage may vary" or "you gotta benchmark it on your data" is about all anyone can say. I believe that there are several good explanations of compression and the overhead, out and about in papers, doc, etc, including these:&lt;BR /&gt;
&lt;A href="http://www2.sas.com/proceedings/sugi28/003-28.pdf" target="_blank"&gt;http://www2.sas.com/proceedings/sugi28/003-28.pdf&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://support.sas.com/resources/papers/proceedings09/065-2009.pdf" target="_blank"&gt;http://support.sas.com/resources/papers/proceedings09/065-2009.pdf&lt;/A&gt;&lt;BR /&gt;
    &lt;BR /&gt;
cynthia</description>
      <pubDate>Mon, 22 Nov 2010 19:14:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37983#M7607</guid>
      <dc:creator>Cynthia_sas</dc:creator>
      <dc:date>2010-11-22T19:14:11Z</dc:date>
    </item>
    <item>
      <title>Re: Character vs Binary compression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37984#M7608</link>
      <description>Hi Cynthia, Peter&lt;BR /&gt;
&lt;BR /&gt;
Thanks for your answers. &lt;BR /&gt;
&lt;BR /&gt;
And yes: RTM - got it.&lt;BR /&gt;
&lt;BR /&gt;
Seems I was a bit naive to assume that binary compression uses a stronger compression algorithm in general (I had too much zip compression in mind).&lt;BR /&gt;
&lt;BR /&gt;
Thanks&lt;BR /&gt;
Patrick</description>
      <pubDate>Tue, 23 Nov 2010 08:08:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37984#M7608</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2010-11-23T08:08:55Z</dc:date>
    </item>
    <item>
      <title>Re: Character vs Binary compression</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37985#M7609</link>
      <description>As a very general rule of thumb, binary compression is not effective unless the observation length is greater than a few hundred bytes. Of course, there are many additional considerations, and I’m sure that someone could come up with some counterexamples. See &lt;A href="http://support.sas.com/documentation/cdl/en/lrdict/63026/HTML/default/viewer.htm#a001288760.htm" target="_blank"&gt;http://support.sas.com/documentation/cdl/en/lrdict/63026/HTML/default/viewer.htm#a001288760.htm&lt;/A&gt; for details.</description>
      <pubDate>Wed, 24 Nov 2010 14:26:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Character-vs-Binary-compression/m-p/37985#M7609</guid>
      <dc:creator>polingjw</dc:creator>
      <dc:date>2010-11-24T14:26:50Z</dc:date>
    </item>
  </channel>
</rss>

