10-12-2017 09:54 AM - edited 10-12-2017 10:01 AM
Hello, often when a team member needs to select a larger data set from MS SQL over to SAS I will see a wait type indicator in SQL "async_network_IO" which essentially is "Waiting for network". At this time other sql databases (non-sas) and overall performance will degrade on the server due to the large network overhead and data transfer.
a. select * from mytable
b. select a,b, c, d, n(100+ columns) from my table (needed to recreate in SAS as a data set)
Data is moved from large tables (100-200gb in sql) over the network to SAS.
a. The end machines are not enabled for jumbo frames
b. SQL works in packets of 4092 bytes size, default non-jumbo fram back is 1492/1500 bytes. (4kb split to 1kb)
I have tried changing the packet size via libname statement which has no effect. Also tried newer SQL "natural" drivers to no avail. Any other ideas to reduce network io issues? My concern is SAS is not reading the data fast enough to keep up with the network and SQL. Any libname optimizations?
LIBNAME myschema ODBC NOPROMPT =
"Driver=SQL Server Native Client 11.0;
10-12-2017 10:05 AM
Those 1492/1500 byte sizes point to some choke point (a hop over a WAN?) in your connection that limits packet sizes.
The most painless (for the database) way to export data to SAS is to unload to a text file on the database server, transfer that file with (S)FTP, and read it on SAS with a data step (or read it directly with filename FTP). FTP transfers are very economic in terms of CPU usage, we do our transfers to/from DB/2 this way for 20 years now.
10-12-2017 10:22 AM
Thanks for the feedback.
a. 1492/1500 is the default packet size of a network card in the Windows environment without jumbo frames enabled on the NIC and your router/swich.
b. FTP is not practical for actuarial work (hundreds of tables, databases, employees etc) utilizing SQL as a data warehouse.
c. SQL to SQL or non-SAS applications often do not have this issue. Any libname suggestions ?
The error can occur from network performance but it also can occur due to SAS not being able to keep up with the number of rows being sent over. SAS by default wants to read < 1500 rows at a time which is not a modern setting.
11-07-2017 05:56 PM
Figured out two reasons:
a. proc sql: if you use compress in join/where statements, conversions in joins/where statements (substring etc), or use "calculated" valuable it will cause the app to go multithreaded to single threaded, often inducing the network ao errors.
b. data steps appear (non proc sql) to also be single threaded and prone to this issue when executed against at able in MS SQL.
11-15-2017 02:32 PM
another item that may be helping:
a. for the hosted server (SQL and if applicable a SAS server)... ideally your server between iscsi/raid/SAN should have JUMBO frames enabled. In most cases unless everything is hosted, jumbo frames may not be enabled on the endpoints (desktops requesting the data) and most likely shouldnt be.
b. however for the NIC(s) of the servers... adjust the read buffer and transmit buffer to the max.
example for HP 10GB nic:
read buffer: set to 4096
(4mb and 16MB I believe)
Some nics only support 256 or 512 or 1024 maxes. Often READ is 1/4 or 1/2 of transmit. This helps reduce packet issues.