11-09-2011 08:17 PM
If I use WHERE clause in a DATA step to search for a non-existing record in a 500Mb SAS dataset (roughly 4 million records, 20 variables) stored in a LAN directory (NOTE there is no SAs server and I am using a PC standalone Base SAS), can anyone tell me how much LAN traffic will occur?
Is it correct that the whole dataset will be transferred once to my PC so the traffic is kind of 500Mb download and not much upload?
I routinly read from/write to SAS datafiles stored on LAN, and join multiple datafiles as well. Any ideas/principles that I should bear in mind in order to reduce the LAN traffic or simply spped up my SAS codes in this scenario?
Thanks a lot.
11-09-2011 11:29 PM
For SAS on a PC to process SAS data from a file server ALL data must first be read across the LAN to the PC, and then any SAS data that is to be stored back on the file server must then be written back across the LAN as well. This can slow down SAS processing considerably. If performance is slow or you want to reduce network traffic, it is better to copy your file server SAS data to the PC hard drive, do all of the processing required using just the hard drive for storage, then only write it back to the file server when all processing is complete.
11-10-2011 05:25 PM
What about using PROC SQL and WHERE clause to search for an observation in a relational database table on a SQL Server? I mean to use a LIBNAME pointing to the database, and refer to the SQL table directly in PROC SQL, something like:
LIBNAME lib_sql mySQLdatabase;
CREATE TABLE test AS
Does the whole table need to be copied to local drive then SAS can do the processing ?
Or, maybe the SQL server does partial data filtering and only the found observation or none being transferred ? - if this is the case the implication probably could be SQL database can be superiorer than SAS datafiles in terms of efficiency in data processing and data traffics?
11-14-2011 08:52 AM
For your SQL server approach, the WHERE clause MAY (or MAY NOT) execute on the remote server. It depends on whether the WHERE clause includes any functions or operators that are NOT supported on the SQL Server. SAS will try to create native SQL code and do all the work on the SQL server, but if it is unsuccessful, then it will transfer the data to the PC and process it there. To guarantee that the code is executed on the SQL server, use Pass Thru SQL from SAS.
See the SAS/Access documentation for your flavor of SQL for debugging options and functions that may be constrained.
11-10-2011 06:07 AM
get the data closer to the SAS processing
Even using a SAS server won't speed up the processing if the data are not close to that server.
When using desktop SAS with large data, collect as much as possible in locally attached drives (an external hard drive is cheap, and overnight time is cheap for the day-time analyst).
Don't use these local copies as the only copies.
There might be some advantage in having more than one locally attached drive, but that depends on configuration.
hope this proves useful