BookmarkSubscribeRSS Feed
Florent
Quartz | Level 8

Hello SAS experts,

Could someone please explain me how to determine the best combination of Bufsize/Bufno values to optimize the performances of my SAS program ?

It acts as an ETL for data retrieved from views in an Oracle database (which works fine by using the Readbuff option in the libname). SAS 9.2 runs on a Windows Server on which there are 16 CPUs and 16 Gb of RAM available (but only 4Gb are assigned to the memsize). The options THREADS, CPUCOUNT=ACTUAL and COMPRESS=CHAR are set for the session.

The biggest data set that is extracted and on which transformations are done is about 20 GB big (and keeps on growing everyday since new data gets inserted into the Oracle database). The data set page size is of 8192 and the number of data set pages is 2149668. I would like to set optimal values of Bufsize/Bufno for the whole session, not per data set.

Thanks a lot in advance for your advises.

Kr,

Florent

2 REPLIES 2
Ksharp
Super User

Maby be PROC DBLOAD could get you faster .

jakarman
Barite | Level 11

It is about understanding the hardware and processes. You have tow of them. Getting Oracle to SAS en working in SAS datasets.
All data is transferred in blocks with sizes of 512 byte 4K 64K and more. The less conversion is needed the better the performance will be.

The more data you can keep in memory and minimize needed IO the better the performance will be.

Oracle:

There must be some transfersize. You can check these and optimize that one and having minimized locking and conversions.
You are seeming not to have issues with this one. save the possible questions for later.      

As it is a DBMS the data is stored in pages of some size. this tuning is DBA work. Funny the same pagesize approach is done with SAS. 

SAS 

- Setting the pagesize low (8K) is adding overhead you could use one of 64k SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition

  Theis setting will be the page size (proc contents) of new datasets.

- The bufno options is often not used. you could set it at 8, 16 or higer. it is the caching of the data (datasets) done by SAS.
Normally the OS is dong the caching efficient enough but with big datasets and little memory (16G and 20G datasets) you can spoil the caching of the OS that is becoming negative for the performance. There are notes on that with sgio parameter https://support.sas.com/resources/papers/IOthruSGIO.pdf

- Compresion=binary is RDC SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition. It a zip like approach on the record (not the file) It will work well with datasets with many variables (long physical record)   

- A pitty you are not on 9.3 or up 46954 - SAS® system option settings for best performance in UNIX and Windows environments The Alignsasiofiles is effective for that alignment to the OS. minimizing at that interface the conversions. The Ibuf and ubufsize are new settings with 9.4  as those options getting effective for the other internal used data   

- Them memsize is te setting for internal measure sas usage. Some statistical procedures will need more. You could optimize sort by using as much as possible (memsize 4G sortsize 3Gb) for that.

---->-- ja karman --<-----

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 1378 views
  • 4 likes
  • 3 in conversation