BookmarkSubscribeRSS Feed
JroeJroe
Obsidian | Level 7

As a MS SQL DB admin I often find SAS code running inefficiently on the server which will generate "network I/O sync" wait times (degraded performance), run serial/single threaded (not parallel) and the queries from SAS are rewritten poorly in SQL or inefficiently. This occurs primarily when any custom SAS logic is used (macros) and the same behavior occurs with PROCSQL and data steps. This is a simple example. 

 

Example a PROC SQL statement:

SELECT

  A,B , SUM(PAID) AS PAID

FROM X

WHERE A between 2000 and 3000

GROUP BY

A,B

 

 

What it should do is rewrite a similar query in SQL. However when you look at SQL this is how the thread will look like:

 

SELECT "A", "B",  "PAID"
FROM X
WHERE ( ( "A" between "2000" and "3000"))

 

- it is not performing a GROUP statement. So it ends up pulling ALL records serial/single threaded over the network (300mil rows) to the person's SAS workspace/PC THEN it performs the group by statement in house. The table is columnstore indexed and can be summarized in SQL in 5 seconds vs hours in SAS (since SAS wants to recreate EACH ROW on a local pc, over a network, single threaded...)

 

BULK INSERTS perform even worse with even more bizarre code observed in SQL . 


LIBNAME TEST ODBC NOPROMPT =
"Driver=SQL Server Native Client 11.0;
Server=myserver;
Database=myDB;
dbsliceparm=all;
threads=yes CPUCOUNT=8;
READBUFF=32767;
DBOMMIT=32767;
INSERTBUFF=32767;
ROWSET_SIZE=32767;
Trusted_Connection=Yes;"
Schema = TEST ;

 

36 REPLIES 36
34reqrwe
Quartz | Level 8

You can use Explicit passthrough if you want to write native MS SQL to run against the Database . 

JroeJroe
Obsidian | Level 7

do you have an example? I had googled that for a few hours last week and didn't get far. Also I'm not sure why SAS sometimes writes "good sql" and sometimes it writes bizarro slow sql that will take 10000%+ longer to process. Appreciate it!

JroeJroe
Obsidian | Level 7
so there is no way for proc sql and data steps to write normal/expected standard sql queries in a standard fashion other than passthrough? (like libname hints etc)?

its random and when SAS does it its often 1000x slower.
andreas_lds
Jade | Level 19

@JroeJroe wrote:

As a MS SQL DB admin I often find SAS code running inefficiently on the server which will generate "network I/O sync" wait times (degraded performance), run serial/single threaded (not parallel) and the queries from SAS are rewritten poorly in SQL or inefficiently. This occurs primarily when any custom SAS logic is used (macros) and the same behavior occurs with PROCSQL and data steps. This is a simple example. 

 

This is hardly possible (red marked text), because the macro-PRE-processor finished his job long before any code is send to MS SQL DB.

 


@JroeJroe wrote:

LIBNAME TEST ODBC NOPROMPT =
"Driver=SQL Server Native Client 11.0;
Server=myserver;
Database=myDB;
dbsliceparm=all;
threads=yes CPUCOUNT=8;
READBUFF=32767;
DBOMMIT=32767;
INSERTBUFF=32767;
ROWSET_SIZE=32767;
Trusted_Connection=Yes;"
Schema = TEST ;


 

The problems could be a consequence of using an odbc connection instead of a native mssql library. Do you have Access to MSSQL licenced?

ChrisNZ
Tourmaline | Level 20

1. If your query really is that simple, SAS should pass the GROUP BY clause to the database.

It is puzzling that is does not.

 

2. Learn how to use pass-through. It is really simple to implement and allows you full control of the code.

I have a hard time believing you "googled that for a few hours last week and didn't get far"

libname EXTDATA <connections parameters>;
proc sql;
  connect using EXTDATA;
  create table T as select * <SAS SQL here> from connection to EXTDATA 
     ( <native SQL here> )
  < more SAS SQL here like order by >
  ;  

Anything between the parentheses is native SQL, anything outside is SAS SQL.

 

3. The libname statement you are showing is SAS code, to point to your ODBC source. Nothing to do with SQL.

JroeJroe
Obsidian | Level 7

Back up a bit, trying to keep things simple for support and not pick a fight with folks. I am looking at simple code statements from a MS SQL DBA perspective that rarely uses sas (sas , lets say we have 10000 different projects, in no way can I have 500 people change thousands of lines of code to pass through sql) , but I will test it since I think its a super handy tool for when "Speed is important", in those cases we simply replace SAS with SQL and process stuff 1000x faster anyway. What I would like to see is SAS handle simple queries in a simple manner in PROC SQL or DATA steps when communicating with MS SQL which is why I'm hoping to find out what is causing the odd processing methodologies observed in SSMS from SAS code. 

 

Andrea Could you help provide an example of:

"The problems could be a consequence of using an odbc connection instead of a native mssql library. Do you have Access to MSSQL licenced?"

 

Yes I have both SQL Enterprise and SQL Standard. Both handle some (mostly all ) SAS code in the same bizarre manner. However sometimes it runs good sql/normal and it is fine. I'm ok with switching how the libname is configured thanks!

andreas_lds
Jade | Level 19

You need to have SAS Access to MSSQL licenced, you can check with proc setinit;run;

JroeJroe
Obsidian | Level 7

thanks! I dont see MSSQL being specified, do you have an example of what it looks like? There are a ton of other items. Also would you have an example libname then i'd have to use after?

 

 

---Base SAS Software
30NOV2019 (CPU A)
---SAS/STAT
30NOV2019 (CPU A)
---SAS/GRAPH
30NOV2019 (CPU A)
---SAS/ETS
30NOV2019 (CPU A)
---SAS/FSP
30NOV2019 (CPU A)
---SAS/OR
30NOV2019 (CPU A)
---SAS/AF
30NOV2019 (CPU A)
---SAS/IML
30NOV2019 (CPU A)
---SAS/QC
30NOV2019 (CPU A)
---SAS/CONNECT
30NOV2019 (CPU A)
---SAS/EIS
30NOV2019 (CPU A)
---SAS/GIS
30NOV2019 (CPU A)
---SAS Enterprise Miner
30NOV2018 (CPU A)
---MDDB Server common products
30NOV2019 (CPU A)
---SAS/Secure 168-bit
30NOV2019 (CPU A)
---SAS/Secure Windows
30NOV2019 (CPU A)
---SAS Text Miner
30NOV2018 (CPU A)
---SAS Enterprise Guide
30NOV2019 (CPU A)
---OR OPT
30NOV2019 (CPU A)
---OR PRS
30NOV2019 (CPU A)
---OR IVS
30NOV2019 (CPU A)
---OR LSO
30NOV2019 (CPU A)
---SAS/ACCESS Interface to PC Files
30NOV2019 (CPU A)
---SAS/ACCESS Interface to ODBC
30NOV2019 (CPU A)
---SAS/ACCESS Interface to OLE DB
30NOV2019 (CPU A)
---SAS Enterprise Miner for Desktop
30NOV2018 (CPU A)
---SAS/IML Studio
30NOV2019 (CPU A)
---SAS Workspace Server for Local Access
30NOV2019 (CPU A)
---High Performance Suite
30NOV2019 (CPU A)
---SAS Add-in for Microsoft Excel
30NOV2018 (CPU A)

ChrisNZ
Tourmaline | Level 20

1. Code translation

If this helps, you can see the statement that SAS sends to the database by using options SASTRACE.

SAS usually does a good job of passing most clauses to the database.

ODBC is generic, so a native engine is better in that respect as highlighted by @andreas_lds .

 

2. Pass-thru

As soon as SAS-specific function are used, the process cannot be passed to the database. 

So something like

SELECT
  A, put(B,z2.) as B, SUM(PAID) AS PAID
FROM X
WHERE A between 2000 and 3000
GROUP BY 1,2

will run entirely in SAS.

 

3. Performance

Which is faster varies. For example, since SQL databases do not maintain sorted tables, full table joins are usually very much faster in SAS on similar hardware.

 

 

jroe
Obsidian | Level 7

I haven't been able to get SAS to consistently write standard code ( I know macro variables and other functions can lead to abnormal code which I often advise folks not to use when talking to SQL).

 

Below is what I did to get bulkload to use tablock (simplified logging) which should stop SAS from creating 2TB files.

 

 

LIBNAME SAS ODBC NOPROMPT = 
"Driver=SQL Server Native Client 11.0;
Server=MYSERVER; 
BULKLOAD=YES;
BL_OPTIONS=TABLOCK;
READBUFF=100000;
DBOMMIT=300000;
INSERTBUFF=32767;
ROWSET_SIZE=32767;
UPDATE_LOCK_TYPE=TABLE;
Trusted_Connection=Yes;
Database=MYDATABASE;
dbsliceparm=all;
threads=yes;
CPUCOUNT=12;
" 
Schema = dbo;

 

Data 
	SAS.MYNEWTABLE (BULKLOAD=YES BL_RECOVERABLE=NO DBCOMMIT=300000 INSERTBUFF=32767 UPDATE_LOCK_TYPE=TABLE READ_LOCK_TYPE=TABLE); 

set MYSASDATA;
run;

 

 

ChrisNZ
Tourmaline | Level 20

Using ODBC is unlikely to support any smart parameters or specific settings.

It is a dumb pipe that's very useful for generic connectivity, but generally unsuited for large volumes or when more control is required.

I see that you also have SAS/ACCESS to OLEDB. You can try this interface though the limitations are possibly similar.

Other options:

- Check that the native SAS/ACCESS interface does what you need, test it and licence it

- Use flat files

jroe
Obsidian | Level 7
Do you have a sample OLEDB connection libname statement to MS SQL server? I tried the guide but wasn't able to get it to work. Thanks!
ChrisNZ
Tourmaline | Level 20

> Do you have a sample OLEDB connection libname statement to MS SQL server?

Sorry I don't

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 36 replies
  • 2733 views
  • 19 likes
  • 8 in conversation