<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic What determines data set page size? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615798#M180164</link>
    <description>&lt;P&gt;I'm migrating to a new Linux server, and I just noticed that a one-record dataset with 3 variables takes up 1.5MB.&amp;nbsp; On the prior server it took 128K.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If I run:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data a ;
  length x 8 y $8 z $100 ;
  x=1 ;
  y='1' ;
  z='1' ;
run ;

proc contents data=a ;
run ;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I get:&lt;/P&gt;
&lt;PRE&gt;Data Set Page Size 524288 
Number of Data Set Pages 2 
Number of Data Set Repairs 0 
Filename /saswork/.../a.sas7bdat 
Release Created 9.0401M6 
Host Created Linux 
File Size 2MB 
File Size (bytes) 1572864 
&lt;/PRE&gt;
&lt;P&gt;I noticed the Data Set Page Size is much bigger than on the prior server.&amp;nbsp; I think it was 65,536 on the old server.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Both prior server and new server are Linux.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there a SAS option that determines page size, or is it an OS thing?&amp;nbsp; I have a good number of small datasets with control data.&amp;nbsp; I hate to think that they could each take 1MB to store.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any other reason a small data set would suddenly take up a lot of disk space?&lt;/P&gt;</description>
    <pubDate>Tue, 07 Jan 2020 21:04:44 GMT</pubDate>
    <dc:creator>Quentin</dc:creator>
    <dc:date>2020-01-07T21:04:44Z</dc:date>
    <item>
      <title>What determines data set page size?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615798#M180164</link>
      <description>&lt;P&gt;I'm migrating to a new Linux server, and I just noticed that a one-record dataset with 3 variables takes up 1.5MB.&amp;nbsp; On the prior server it took 128K.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If I run:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data a ;
  length x 8 y $8 z $100 ;
  x=1 ;
  y='1' ;
  z='1' ;
run ;

proc contents data=a ;
run ;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I get:&lt;/P&gt;
&lt;PRE&gt;Data Set Page Size 524288 
Number of Data Set Pages 2 
Number of Data Set Repairs 0 
Filename /saswork/.../a.sas7bdat 
Release Created 9.0401M6 
Host Created Linux 
File Size 2MB 
File Size (bytes) 1572864 
&lt;/PRE&gt;
&lt;P&gt;I noticed the Data Set Page Size is much bigger than on the prior server.&amp;nbsp; I think it was 65,536 on the old server.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Both prior server and new server are Linux.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there a SAS option that determines page size, or is it an OS thing?&amp;nbsp; I have a good number of small datasets with control data.&amp;nbsp; I hate to think that they could each take 1MB to store.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any other reason a small data set would suddenly take up a lot of disk space?&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2020 21:04:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615798#M180164</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2020-01-07T21:04:44Z</dc:date>
    </item>
    <item>
      <title>Re: What determines data set page size?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615820#M180167</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/19879"&gt;@Quentin&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;On my Windows workstation (with SAS 9.4M5) the &lt;A href="https://documentation.sas.com/?docsetId=lesysoptsref&amp;amp;docsetTarget=p1d8hx95jb53wqn0zzvawxw94nvi.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_blank" rel="noopener"&gt;BUFSIZE= system option&lt;/A&gt; (which can be overridden by the &lt;A href="https://documentation.sas.com/?docsetId=ledsoptsref&amp;amp;docsetTarget=n0pw7cnugsttken1voc6qo0ye3cg.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_blank" rel="noopener"&gt;BUFSIZE= data set option&lt;/A&gt;) determines data set page size. With the default value 0 (i.e. "minimum optimal buffer size for the operating environment") I get the same result as with BUFSIZE=64k for your test dataset:&lt;/P&gt;
&lt;PRE&gt;Data Set Page Size          65536
Number of Data Set Pages    1
First Data Page             1
Max Obs per Page            545
Obs in First Data Page      1
File Size                   128KB
File Size (bytes)           131072&lt;/PRE&gt;
&lt;P&gt;With BUFSIZE=512k I obtain:&lt;/P&gt;
&lt;PRE&gt;Data Set Page Size          524288
Number of Data Set Pages    1
First Data Page             1
Max Obs per Page            4364
Obs in First Data Page      1
File Size                   1MB
File Size (bytes)           1048576&lt;/PRE&gt;
&lt;P&gt;I don't know why it's &lt;EM&gt;2&lt;/EM&gt; pages on your system.&lt;/P&gt;
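&lt;P&gt;As a minimal sketch of the two ways to set it (the data set names b and c are just placeholders, and the sizes reported will depend on the host): the system option changes the default for the rest of the session, while the data set option overrides it for a single data set.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* show the current setting of the BUFSIZE= system option */
proc options option=bufsize ;
run ;

/* change the session default; data sets created from now on use it */
options bufsize=64k ;

data b ;
  x=1 ;
run ;

/* override the session default for one data set only */
data c (bufsize=32k) ;
  x=1 ;
run ;

/* compare "Data Set Page Size" in the two reports */
proc contents data=b ;
run ;

proc contents data=c ;
run ;&lt;/CODE&gt;&lt;/PRE&gt;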
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data set page size is usually very close to the BUFSIZE value, e.g. 123904 for BUFSIZE=123456 or 1234944 for BUFSIZE=1234567 (rounded up to an integer multiple of 512, I guess). Values smaller than the "minimum optimal buffer size" are allowed: for example, 32k yields a file size of only 64KB.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2020 21:55:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615820#M180167</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-01-07T21:55:14Z</dc:date>
    </item>
    <item>
      <title>Re: What determines data set page size?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615881#M180202</link>
      <description>&lt;P&gt;Look at the block sizes of the server filesystems.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2020 10:39:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615881#M180202</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-01-08T10:39:40Z</dc:date>
    </item>
    <item>
      <title>Re: What determines data set page size?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615934#M180223</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;&amp;nbsp;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I confirmed that the prior server had the default BUFSIZE=0 (which chooses the minimum recommended for the OS, apparently 65,536), but the new server has bufsize=524288 set.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I guess they're trying to decrease I/O for processing big data sets.&amp;nbsp; The docs say:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P class="xisDoc-paragraph"&gt;The page size is the amount of data that can be transferred from a single input/output operation to one buffer. The page size is a permanent attribute of the data set and is used when the data set is processed.&lt;/P&gt;
&lt;P class="xisDoc-paragraph"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="xisDoc-paragraph"&gt;A larger page size can improve execution time by reducing the number of times SAS has to read from or write to the storage medium. However, the improvement in execution time comes at the expense of increased memory consumption.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P class="xisDoc-paragraph"&gt;I'll double check with the admins to make sure they're happy with the trade-off, but it's their server.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P class="xisDoc-paragraph"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="xisDoc-paragraph"&gt;Was just a surprise yesterday as I was comparing data on the source server to the target server, and realized all these small data sets were taking much more space.&lt;/P&gt;
&lt;P class="xisDoc-paragraph"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="xisDoc-paragraph"&gt;I confirmed on the target server, if I explicitly set bufsize to a 65,536, I'll get a 192K file again:&lt;/P&gt;
&lt;P class="xisDoc-paragraph"&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data a (bufsize=65536);
  length x 8 y $8 z $100 ;
  x=1 ;
  y='1' ;
  z='1' ;
run ;

proc contents data=a ;
run ;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 08 Jan 2020 14:10:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-determines-data-set-page-size/m-p/615934#M180223</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2020-01-08T14:10:39Z</dc:date>
    </item>
  </channel>
</rss>

