Hi,
I am trying to set BUFSIZE and BUFNO to read a large dataset (20 million rows).
My page size is 131K.
I would like to know whether changing BUFSIZE or BUFNO would give me better performance.
Do I have to set BUFSIZE when the dataset is created?
data tabla(BUFSIZE=131K);
set.....
If I create it like this, will the BUFSIZE specified in this data step be the one used when the dataset is read?
Is there any rule for calculating an optimal value?
Thanks in advance
Yes, BUFSIZE is set when the table is created.
BUFNO is set when the table is used.
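As a minimal sketch of that distinction (the table names work.big_source and work.big_copy and the values 128k and 100 are made up for illustration, not recommendations), BUFSIZE= can be given as a data set option on the output table and BUFNO= as a data set option wherever the table is read:

```sas
/* Hypothetical table names, for illustration only */

/* BUFSIZE= is an output option: the page size is stored with the table */
data work.big_copy (bufsize=128k);
   set work.big_source;
run;

/* PROC CONTENTS reports the page size actually used ("Data Set Page Size") */
proc contents data=work.big_copy;
run;

/* BUFNO= is chosen each time the table is processed */
data _null_;
   set work.big_copy (bufno=100);
run;

/* Alternatively, set a session-wide default for BUFNO */
options bufno=100;
```

Changing the page size of an existing table means rewriting it, since the value is fixed at creation.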
These options do affect performance. The data set's buffer size, the number of buffers used, and whether or not Windows direct I/O (the SGIO option) is used all have a huge effect. To illustrate this point, here is a benchmark example taken from
https://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490
Table 5.5: Run times (in seconds) for various values of BUFSIZE, BUFNO and SGIO for a 10,000 MB (10,000,000 kB) data set

| BUFSIZE | SGIO | BUFNO=1 | BUFNO=5 | BUFNO=25 | BUFNO=100 | BUFNO=500 |
|---------|------|---------|---------|----------|-----------|-----------|
| 0       | no   | 195.45  | .       | .        | .         | .         |
| 4k      | no   | 235.25  | 236.25  | 235.56   | 239.78    | 235.39    |
| 4k      | yes  | 1177.55 | 572.32  | 157.28   | 93.60     | 93.96     |
| 8k      | no   | 163.77  | 162.05  | 164.22   | 164.05    | 161.34    |
| 8k      | yes  | 542.85  | 268.86  | 95.45    | 81.07     | 80.84     |
| 16k     | no   | 138.03  | 140.26  | 136.16   | 137.49    | 136.11    |
| 16k     | yes  | 265.46  | 142.55  | 75.28    | 74.61     | 81.70     |
| 32k     | no   | 143.70  | 148.85  | 137.96   | 147.67    | 143.98    |
| 32k     | yes  | 140.56  | 93.84   | 73.35    | 73.33     | 75.48     |
| 64k     | no   | 180.47  | 171.39  | 166.63   | 151.69    | 173.16    |
| 64k     | yes  | 88.62   | 70.14   | 71.62    | 72.00     | .         |
| 128k    | no   | 235.25  | 236.25  | 235.56   | 239.78    | 235.39    |
| 128k    | yes  | 70.72   | 71.30   | 71.34    | 70.33     | .         |
The default run time is 195 seconds, which can be reduced to about 70 seconds simply by changing three settings: BUFSIZE, BUFNO and SGIO.
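Since the best values depend on the hardware and the data, it is probably worth timing a few combinations on your own table. The macro below is only a rough sketch of how that could be done, not the method used for the figures above: the macro name time_bufno and its parameters are hypothetical, and it varies BUFNO only, because BUFSIZE requires rewriting the table. SGIO is, as far as I know, a Windows start-up option, so it has to be enabled when SAS is invoked or in the configuration file rather than inside the session.

```sas
/* Hypothetical helper: reads the same table once per BUFNO value
   and writes the elapsed time for each read to the log.          */
%macro time_bufno(ds=, values=1 5 25 100 500);
   %local i n start;
   %do i = 1 %to %sysfunc(countw(&values, %str( )));
      %let n = %scan(&values, &i, %str( ));
      %let start = %sysfunc(datetime());

      /* Read every row without writing any output */
      data _null_;
         set &ds (bufno=&n);
      run;

      %put NOTE: BUFNO=&n elapsed %sysevalf(%sysfunc(datetime()) - &start) seconds;
   %end;
%mend time_bufno;

/* Example call on a hypothetical table */
%time_bufno(ds=work.big_copy)

/* Show the settings currently in effect for the session */
proc options option=(bufsize bufno);
run;
```

Repeated reads of the same table can be served from the operating system's file cache, so later runs may look faster than they really are; treat the numbers as indicative only.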