Solved: Re: SAS optimizations with MS T-SQL and robust systems/servers suggest... - Page 2

Ksharp · Posted 01-04-2016 08:06 PM

Did you try Pass Through way to pass MS-SQL statement into SQL Database ? Do not use SAS itself SQL .

I am not sure if it could work , but deserve a try.

jroe · Posted 01-05-2016 08:58 AM

yesterday I got it to work which was cool but ended up being slow due to null vs '' (blanks) in the data which are handled seperately in sql while sas will help leverage them and treat them the same. I wanted to use sas to compare two tables (one created from sas and exported to sql and another created in sql)... when the query is executed in sql it takes a while since it will return results (mismatches since null and '' are not the same). Sas helps treat them the same and was much faster since there were no mismatches in that regard.

However when using pass through procsql I was able to see 16 cores used on the sql server vs normal procsql (1 core). I would like to see procsql even though it wont perform the join without passthrough use the multiple cores on the server. I'm not sure if it is a limitation of how procsql communicates on odbc but it is odd that MS SQL doesn't optimize the procsql queries for multithreading unless passthrough is used yet still allows the data to be read.

I have not been able to find a way for SAS during procsql selects without fedsql or passthrough be able to use more than one core on the SQL server(MS TSQL). All of the optimizations primarily appear to effect local pc sas performance and not the connected library via odbc to MS TSQL.

jakarman · Posted 01-05-2016 11:04 AM

When you are trying to optimize read speed look also at : dbslice dbsliceparm, https://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a002316130.htm

This only helps when the other bottlenecks are not more important.
Reading all dat unsorted from a DBMS can be very fast, In those situations you are adding overhead by wanting to run threads.

The sequential IO reading data in that order from disk is hardly to be improved. In contrary it is getting slowed down (spinning ones)

---->-- ja karman --<-----

jroe · Posted 01-05-2016 11:54 AM

I have tested those and currently have them included in the config file, options and libname. It helped locally in SAS on the workstation to use more than one core but not in SQL on the MS SQL server where only one core is being used when SAS code is executed.

The problem appears to be select statements are coming over single threaded to the MS SQL server. If you run a similar select statement 8-16 cores are used in SQL.

The procsql over odbc appears to use minimal resources when pulling data from the server except when sql passthrough specifically.

In this case the data is pulled from two different libnames and the warning emerges that the joins will be processed in SAS, I tried several directsql commands and they had no effects on the SQL server side of the pull. It appears that the ODBC sas pull from the select is only single threaded on the server (although multithreaded slightly locally).

Locally maybe 1-5% of the performance is increased with all the optimizations and the cpu usage goes up. The real gains would occur if during the procsql select it used multithreading on the server possibly.

jroe · Posted 01-08-2016 10:00 AM

an item of note with sql libnames... the "optimizations" need to be added before the scema and not within the quotes section or it was ignored. No notice in performance but for statements I could see them in action in the log only if I placed them outside the quotes... if you set an invalid value within the quotes it would allow it (ignored) and fail if outside unless set to a proper setting.

ROWSET_SIZE= 3276731; is invalid and ignored...

LIBNAME test ODBC NOPROMPT =
"Driver=SQL Server Native Client 10.0;
Server=tester;
Database=test1;
ROWSET_SIZE= 3276731;
Trusted_Connection=Yes;"
dbsliceparm=(all,4)
ROWSET_SIZE= 32767
readbuff= 32767
Schema = test123;

SASKiwi · Posted 01-10-2016 02:56 PM

This reference is useful when understanding how READBUFF works with SQL Server ODBC:

http://support.sas.com/documentation/cdl/en/acreldb/68028/HTML/default/viewer.htm#p0u61lysy7qgzmn1bd...

I've found that leaving READBUFF default value for SQL Server of 0 works the best in most situations. Ultimately it is the bandwidth of the network connection between SQL Server and SAS that dictates overall performance.

I strongly suggest you explore SQL PASSTHRU as a means of improving performance as has already been discussed in this thread.

Here is an example of what you could try:

proc sql noprint;
  connect to odbc (noprompt = "server=tester;DRIVER=SQL Server Native Client 10.0;Trusted Connection=yes;DATABASE=test1;" );
  create table test as 
  select * from connection to odbc
  (SELECT *
   FROM [tester].[test1].[tablename]
  )
  ;
  disconnect from odbc;
quit;

jroe · Posted 01-11-2016 08:32 AM

Thanks! I had tried something similar but it hadn't worked out. I am going to give this a try.

🙂

Re: SAS optimizations with MS T-SQL and robust systems/servers suggestions

Re: SAS optimizations with MS T-SQL and robust systems/servers suggestions

Re: SAS optimizations with MS T-SQL and robust systems/servers suggestions

Re: SAS optimizations with MS T-SQL and robust systems/servers suggestions

Re: SAS optimizations with MS T-SQL and robust systems/servers suggestions

Re: SAS optimizations with MS T-SQL and robust systems/servers suggestions

Re: SAS optimizations with MS T-SQL and robust systems/servers suggestions