07-23-2013 04:36 PM
I am new to SAS and SAS communities! I have been trying to understand The best/most efficient way to query data from our MS SQL Server. Currently I am using SAS Access for ODBC
Here are my Observations:
If I use the query builder in a process flow and drag in two libraries, join them and retrieve a couple of columns from each, I see four threads open on my sql server - two per table. While I don't like this, I expect it and know that is the behaviour.
If I then add a filter to my query and hit run again, it opens four new threads!! I was unaware of this.
If I use a proc sql; statement and write the query manually, I reduce the number of threads to two. This is a plus. Apparently proc sql commands use a cursor which means I lose all of the caching/performance benefits of multi-threaded reads and SAS caching.
My questions are:
1> Is the SAS/ACCESS Interface to Microsoft SQL Server a better solution?? Better performance, fewer threads etc.
2> Is there a different SAS code option like proc sql on steroids??
I can change the user properties that SAS is using to kill the client thread after a smaller amount of inactivity but this seems like a poor use of resources on the part of the SASAPP server?
Thanks for any opinions/insights/feedback etc.
07-23-2013 05:03 PM
Shawn, welcome to the community! I'll supply some information that might help.
When you use the Query Builder, EG opens the tables (to get column info and such), so that interaction might account for some of the threads. These "open" actions are in addition to the activity that the PROC SQL program will generate. Theoretically the Query Builder will close the tables when you close the window...
Because you have two libraries interacting, you might be limiting the efficiency of PROC SQL. PROC SQL will try to push as much work to the database as possible, to limit the amount of data that has to come back to SAS. However, joins across two different libraries (and therefore different database connections) will probably not push down, so more records come back to SAS to perform the join.
You might try to use "explicit passthrough" -- an option in the Query Builder. Select "Options for this Query" and see if you can take advantage of the generated CONNECT statement. This ensures that all database operations occur in the SQL Server.
Because of the multiple libraries, you might need to write your own PROC SQL program to really take advantage of explicit passthrough.
And just in case it fits your situation, don't forget about the subquery approach:
07-24-2013 04:43 AM
"System Performance hardware/support cost" vs "ease of coding (man time cost)" have always been and will be a difficult subject with discussions.
It is always shifting in the direction of accepting more computer resource overhead (alwasy becoming cheaper) to the ease of coding.
At the moment of not being comfortable with the performance/behavior (Volume Velocity Variety) is is getting attention to do something.
I remember the use of Assembler (like C) vs using Cobol for prggrammers was a similar disussion. Indeed that is of ancient times.
The perfomance questions with SAS are covered/part of SAS advanced (coding) learning path. We are on the trail of using Eguide now with te click klik, an other shift.
I know: a/ Eguide is not blocking the old coding habits b/ all other SAs clients are evolving in a menu-style approach (Miner, Information map, Olap Cube studio, DI Studio VA etc)
The SAS/access modules are ver well documented. Most people don not review those as they are containing all the technical options and issues needing the technical knowledge and overview of the DBA.
Shareread globalread locking commit bulkload all have their advantaqges and trade offs. The default settings and options are chosen for a quick reliable ease start. See:
SAS/ACCESS(R) 9.4 for Relational Databases: Reference, Second Edition (CONNECTION= LIBNAME Option)
Reading the MS-SQL specifics of access: SAS/ACCESS(R) 9.4 for Relational Databases: Reference, Second Edition (LIBNAME Statement Specifics for Microsoft SQL Server) you have the dbsliceparm.
They are not that different as to ODBC specifics: SAS/ACCESS(R) 9.4 for Relational Databases: Reference, Second Edition (LIBNAME Statement Specifics for ODBC)
Here you have your steroids....
And more stuff
Using explicit SQL pass thru is the sceneario: you are telling you are better in SQL-coding than the menu-driven way.
By that it is offering the SQL specifics of the DBMS. Sounds terrible: SQL is not that standard as expected. SAS is using ANSI-SQL (unless you are going to use the new DS2 features).
The specfics of sql pass-thru is giving the option to access information in het DBMS in the MDL domain. (DDL normally restricted for the DBA, DML commonly open for user applications)
SAS/ACCESS(R) 9.4 for Relational Databases: Reference, Second Edition (SQL Pass-Through Facility Specifics for ODBC)
Have fun with all of this,