Re: SAS: how to make it write proper SQL that do not degrade performan... - Page 2

ScottBass · Posted 10-03-2019 10:48 PM

@ChrisNZ wrote:

> Do you have a sample OLEDB connection libname statement to MS SQL server?

Sorry I don't

https://www.connectionstrings.com/

https://www.connectionstrings.com/sql-server/

https://www.connectionstrings.com/sql-server-native-client-11-0-oledb-provider/

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

ChrisNZ · Posted 04-02-2019 06:28 PM

> I haven't been able to get SAS to consistently write standard code ( I know macro variables and other functions can lead to abnormal code which I often advise folks not to use when talking to SQL).

What is standard code?

SAS does not write code (except for the implicit pass-through optimisations) and macros do no write code either.

Macros don't write abnormal code. This sentence has no meaning as stated.

I see that you found a way to do what you needed:

data SAS.MYNEWTABLE (BULKLOAD=YES BL_RECOVERABLE=NO DBCOMMIT=300000 INSERTBUFF=32767 UPDATE_LOCK_TYPE=TABLE READ_LOCK_TYPE=TABLE); 
  set MYSASDATA;
run;

Well done for finding, testing and implementing these options.

What is the issue now? Is it that you'd like these option to be part of the default libname setting?

Note that proc copy or proc append are faster than a data step for simply copying tables.

Also note that the table creation can be accelerated by setting larger values for options BUFNO and BUFSIZE, and possibly by using Windows's SGIO, if disk I/O is the culprit.

High-Performance SAS Coding - Third Edition

ScottBass · Posted 04-02-2019 09:18 PM

SELECT

  A,B , SUM(PAID) AS PAID

FROM X

WHERE A between 2000 and 3000

GROUP BY

A,B

As others have said, use explicit pass through (esp. if you don't use SAS much). With explicit pass through, SAS is just a client to SQL Server, analogous to Powershell, .Net, SSMS, Toad, etc. IOW, it's just passing SQL to your RDBMS (SQL Server). You could submit the same query via Powershell (Invoke-Sqlcmd) and return the data to the console (say to pipe to ConvertT-Csv)

As to your query, what I think is happening is, SUM() is both a SAS and SQL Server function. If a SAS implicit pass through query contains a SAS function, formats, etc, a more generic query is sent to the RDBMS, then SAS does the aggregation.

I would write your query as explicit pass through as follows:

* set desired SAS options ;
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
options dbidirectexec;
options msglevel=I;
options fullstimer;

* allocate SQL Server libraries ;
%libname_sqlsvr(libref=FOO,server=MY_SS_SERVER,port=,database=my_database,schema=dbo)

proc sql;
   connect using foo;
   create table blah as
   select *
   from connection to foo (
SELECT A
      ,B
      ,SUM(PAID) AS PAID
  FROM X
 WHERE A between 2000 and 3000
GROUP BY
       A
      ,B
   );
quit;

Debug your query in SQL Server Management Studio. The explicit pass through should then have similar performance as via SSMS, barring the network traffic to send the aggregated data back to SAS to create the data set. And hopefully that aggregation is "small".

I see you have Access to ODBC and Access to OLEDB. We use ODBC; if you want to use it (or at least try it) as well, perhaps this may help: https://github.com/scottbass/SAS/blob/master/Macro/libname_sqlsvr.sas

That macro may get you started, otherwise hit the doc for either ODBC or OLEDB.

Hope this helps...

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

jroe · Posted 04-03-2019 10:52 AM

Thanks for the feeback! Is your example pass through sql though or just "as clean/normal sql as possible"?

I agree when I coach folks to not use variables or sas specific stuff their code runs better (it usually rewrites the query better once it hits ms sql"). Thanks!

ErikLund_Jensen · Posted 04-03-2019 11:57 AM

Hi @jroe

It is proper SQL. Everything you write in the connection to xxx ( <sql-code> ); construct is just passed to SQL. So a query copied from Microsoft SQL Server manager will work, as shown in the following example:

LIBNAME xmanbmf ODBC  IGNORE_READ_ONLY_COLUMNS=YES  DATAsrc=EXT_Manuel_BMF  SCHEMA=dbo ;
proc sql;
	connect using xmanbmf as myconnect;
	create table work.blah as
		select *
			from connection to myconnect (

SELECT TOP (1000) [SysID]
,[jobstatus]
,[jobstatus_tekst]
FROM [DWH_MAN_BMF].[dbo].[sd_jobstatus]

			);
	disconnect from myconnect;
quit;

You are not limited to queries, where you use the result as input to SAS. Any SQL command can be passed using the execute ( <sql-code> ) by xxx; construct as shown:

LIBNAME xmanbmf ODBC  IGNORE_READ_ONLY_COLUMNS=YES  DATAsrc=EXT_Manuel_BMF  SCHEMA=dbo ;
proc sql;
	connect using xmanbmf as myconnect;
	execute (

CREATE TABLE [DWH_MAN_BMF].[dbo].[test2_jobstatus](
	[SysID] [int] IDENTITY(1,1) NOT NULL,
	[jobstatus] [char](1) NULL,
	[jobstatus_tekst] [varchar](60) NULL
)

	) by myconnect;
	disconnect from myconnect;
quit;

ScottBass · Posted 10-03-2019 10:56 PM

@ErikLund_Jensen wrote:

Hi @jroe

It is proper SQL. Everything you write in the connection to xxx ( <sql-code> ); construct is just passed to SQL. So a query copied from Microsoft SQL Server manager will work, as shown in the following example:
LIBNAME xmanbmf ODBC  IGNORE_READ_ONLY_COLUMNS=YES  DATAsrc=EXT_Manuel_BMF  SCHEMA=dbo ;
proc sql;
	connect using xmanbmf as myconnect;
	create table work.blah as
		select *
			from connection to myconnect (

SELECT TOP (1000) [SysID]
,[jobstatus]
,[jobstatus_tekst]
FROM [DWH_MAN_BMF].[dbo].[sd_jobstatus]

			);
	disconnect from myconnect;
quit;
You are not limited to queries, where you use the result as input to SAS. Any SQL command can be passed using the execute ( <sql-code> ) by xxx; construct as shown:
LIBNAME xmanbmf ODBC  IGNORE_READ_ONLY_COLUMNS=YES  DATAsrc=EXT_Manuel_BMF  SCHEMA=dbo ;
proc sql;
	connect using xmanbmf as myconnect;
	execute (

CREATE TABLE [DWH_MAN_BMF].[dbo].[test2_jobstatus](
	[SysID] [int] IDENTITY(1,1) NOT NULL,
	[jobstatus] [char](1) NULL,
	[jobstatus_tekst] [varchar](60) NULL
)

	) by myconnect;
	disconnect from myconnect;
quit;

As a very minor tip: I found the below syntax by accident, it's not illustrated in the doc. I find it easier to prevent typos in the connection name:

proc sql;
	connect using xmanbmf as myconnect;
	execute by myconnect (

CREATE TABLE [DWH_MAN_BMF].[dbo].[test2_jobstatus](
	[SysID] [int] IDENTITY(1,1) NOT NULL,
	[jobstatus] [char](1) NULL,
	[jobstatus_tekst] [varchar](60) NULL
)

	);  /* put the "by myconnect" after execute, not after the final parentheses */
	disconnect from myconnect;
quit;

As I said, minor but HTH...

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

ChrisNZ · Posted 04-03-2019 05:31 PM

>I agree when I coach folks to not use variables or sas specific stuff their code runs better (it usually rewrites the query better once it hits ms sql").

SAS-specific syntax can vastly improve performance in SAS just as T-SQL-specific features can do the same in SQL Server.

They are both useful and ruling out some elements of a language is counter-productive.

That's why the very useful option to use explicit pass-through exists: to run native code on the appropriate platform as needed.

High-Performance SAS Coding - Third Edition

Tom · Posted 04-03-2019 12:11 PM

I am not sure I understand your point. In SQL code you just describe what output set you want to create. It is up the the implementer (the database that is accepting the SQL) to do any needed optimizations. It can be very tricky. Queries that work great in Oracle might cause havoc in Teradata and the reverse.

JroeJroe · Posted 10-03-2019 10:26 AM

I still see a high amount of MS SQL wait types of ASYNC_NETWORK_IO (SAS reads data slower than SQL sends it), it would be great to find a way to reduce this wait type, I have tried mulitple drivers, hints, options, libname tweaks etc without success. It primarily happens when SAS runs data "single threaded" instead of parallel processing, primarily when pulling back fields without a complex where/group by clause (that triggers the code to run on the server then pull back the summary). I would love this wait type to go away, it only occurs when SAS communicates to sql.

https://www.sqlshack.com/reducing-sql-server-async_network_io-wait-type/

The server is enterprise with 768GB of ram and 24 cores (xeon), it can read/write at 2000MBs+. Often users are pulling data down to their slower local drives into sas, which it can help if a "summary is ran against sql" instead of reproducing tables locally.

A few tips that I have passed on to others:

a. if your joining a table that exists in SAS to SQL, that table doesnt exist in SQL so SAS will "select the columns needed and pull them back to sas then join", so they could add the table to sql if needed (and index it) or index their table in sas

b. remove "SAS" code that doesnt exist in sql (macro variables, certain functions etc)

Tom · Posted 10-03-2019 12:47 PM

You should probably ask SAS support about the ASYNC_NETWORK_IO waits.

Are you using SAS/Access SQL server?

https://documentation.sas.com/?docsetId=acreldb&docsetTarget=p0378ek9nfsu5bn1cpsc8woi6qjf.htm&docset...

Or cobbling something together using ODBC?

If you want to get multi-threaded I/O you will probably have better luck using the SAS product. But ask SAS about it before paying for it. (or see if they will let you use it on a trail basis.)

JroeJroe · Posted 10-03-2019 01:11 PM

odbc libname.

any plugin you are referencing?

For Base SAS Software ...
Custom version information: 9.4_M6

For SAS/STAT ...
Custom version information: 15.1
For SAS/GRAPH ...
Custom version information: 9.4_M6
For SAS/CONNECT ...
Custom version information: 9.4_M6
For High Performance Suite ...
Custom version information: 2.2_M7
For SAS/ACCESS Interface to PC Files ...
Custom version information: 9.4_M6
For SAS/ACCESS Interface to ODBC ...
Custom version information: 9.4_M6
NOTE: PROCEDURE PRODUCT_STATUS used

Tom · Posted 10-03-2019 01:46 PM

Does not looks like you have it. At least it is not installed.

Talk to your SAS Sales Rep about whether it would help and how to get it.

ChrisNZ · Posted 10-03-2019 05:45 PM

It seems from what you describe that the SAS end has much slower drives, which would explain all the waiting, no?

High-Performance SAS Coding - Third Edition

JroeJroe · Posted 10-03-2019 05:48 PM

That is partially it (local desktop or network share vs SQL on optimized SAN), also the latency of transferring data over the network. However SAS also appears to process incoming data slower (possibly due to packet size differences) and is unable to keep up with the packets (network sync wait type).

It is also odd that when the query is observed in MS SQL it is single threaded, the same query ran directly in sql would be parallel threaded.

ChrisNZ · Posted 10-03-2019 05:59 PM

Mmm interesting.

Maybe setting option TCPMSGLEN can help.

It is documented against SAS/SHARE and SAS/CONNECT only, but since it sets TCP buffer, it may well impact SAS/ACCESS as well.

Are you using ODBC or native access on the SAS client?

High-Performance SAS Coding - Third Edition

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Re: SAS: how to make it write proper SQL that do not degrade performance 10000% or more against MS S

Classroom Training Available!