Whenever reading data from a database and moving the result set to SAS you want to concentrate on 2 things:
1. Push processing to the database and reduce data volumes as much as possible on the DB side
2. Move the data as efficiently over the network as possible
For 1:
If you're not using explicit pass-through SQL then use OPTIONS SASTRACE=',,,d' SASTRACELOC=SASLOG NOSTSUFFIX; before you read from the DB. This will tell you in the SAS log which part of your SAS statement the SAS/Access engine could push to the database.
If you use dataset options and functions that SAS supports for pushing into the database then the SAS/Access engine will convert these instructions into DB2 SQL and send it to the database for execution.
I personally prefer to use SQL when interfacing with a database but SAS datastep code as below should work as well.
/* option 1 */
data temp17_ar_1;
set potbnd.ar(keep=src_stm_id where=(src_stm_id in (1,30,42,48,49)));
run;
/* option 2 */
proc sql;
create table temp17_ar_1 as
select src_stm_id
from potbnd.ar
where src_stm_id in (1,30,42,48,49)
;
quit;
For 2:
The bottleneck is very often the data transfer from the DB to SAS. There are many things out of your control here BUT the one thing you can change easily: Libname option READBUFF defines how many rows get transferred at once. If this options is not set explicitly then the default is 1. Increasing this value significantly leads often to much better performance.
Thank you @Patrick , Awesome points. Pity that I will have to wait till Monday to try that out alongside with help from admin. I request your patience if I may have any follow up question once I attempt interacting with the team/admin. Cheers!
I like your passion.
Here a few more things worth knowing.
You can use the library meta engine to assign a library that is defined in metadata as not pre-assigned (the white nodes in EG) using code as below:
/*
assign library defined in metadata that is not pre-assigned
- explicitly add option readbuff
*/
%let lref=potbnd;
libname &lref meta liburi="SASLibrary?@Libref='&lref'" metaout=data readbuff=100000;
libname &lref list;
The SAS/Access Engine can't push all SAS functions to DB2. The list of functions that can get pushed is here .
If you use any other function in a where clause or join condition then SAS will first load the data into SAS and only execute the function within SAS. The code will still work but performance will degrade because more data gets transferred over the network.
Explicit pass-through SQL lets you write SQL in the database flavor. So here SAS will just send the code as-is to the database. This allows you full control what gets sent to the DB and you can also use DB functionality not available via implicit SQL or SAS datastep code.
/* explicit pass-through SQL using a library definition to connect to a database */
proc sql;
connect using &lref as mycon;
select * from mycon as
(
select src_stm_id
from OTBND.ar
where src_stm_id in (1,30,42,48,49)
with UR
)
;
disconnect from mycon;
quit;
You've posted the connection info for you library....
General DB2-POTBND Type:Library Server: SASApp Engine: DB2 Location: /Shared Data/DB2-POTBND Options: schema='OTBND' READ_ISOLATION_LEVL=UR Datasrc=MODPOTBD Libref: POTBND Read Only :No Temporary: No
...and it looks like the DB2 schema you need is already defined in your libary. I believe you therefore could omit the DB2 schema name in the DB2 SQL part. The advantage of a one level name: If the DB2 schema ever changes then the only place which requires amendment is the metadata libname definition.
..... ( select src_stm_id from ar where src_stm_id in (1,30,42,48,49) with UR ) .....
What above also demonstrates: A SAS Libname connects to a single database schema and that's where your code executes. If you want to join tables from different schemas then you would need two different libnames - but in such a case SAS can only push schema specific filter conditions to the DB. It can't push conditions which require access to both schemas (as SAS uses two different connections/libnames).
If you use explicit pass-through SQL then you can use a single libname and have the join in-database. You just need to use a two level name for your tables with the DB2 schema name as first part. This should work as long as the user connecting to the DB can access both schemas.
And last but not least:
I'm not sure that what I've said about READBUFF applies to DB2. Still worth trying though. From the docu:
DB2 under UNIX and PC Hosts
If you do not specify a value for this option, the default buffer size is automatically calculated based on the row length of your data. The SQLExtendedFetch API call is used
Have fun!
Good morning/Good evening Sir @Patrick @down under. Thank you for all the details. I just got in and here is the test results
22 GOPTIONS ACCESSIBLE;
23 /*testing when yellow is unassigned on the left hand side in SAS EG*/
24
25 %let lref=potbnd;
26 libname &lref meta liburi="SASLibrary?@Libref='&lref'" metaout=data readbuff=100000;
________
22
ERROR: Libref POTBND is not assigned.
ERROR: Error in the LIBNAME statement.
ERROR 22-7: Invalid option name READBUFF.
27 libname &lref list;
ERROR: Libref POTBND is not assigned.
ERROR: Error in the LIBNAME statement.
23 /*testing when yellow is assigned on the left hand side in SAS EG*/
24 %let lref=potbnd;
25 libname &lref meta liburi="SASLibrary?@Libref='&lref'" metaout=data readbuff=100000;
________
22
ERROR: Libref POTBND is not assigned.
ERROR: Error in the LIBNAME statement.
ERROR 22-7: Invalid option name READBUFF.
26 libname &lref list;
ERROR: Libref POTBND is not assigned.
ERROR: Error in the LIBNAME statement.
23 proc sql;
24 connect using potbnd as mycon;
ERROR: A Connection to the META DBMS is not currently supported, or is not installed at your site.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
25 select * from mycon as
26 (
_
22
76
ERROR 22-322: Expecting a name.
ERROR 76-322: Syntax error, statement will be ignored.
27 select src_stm_id
28 from OTBND.ar
29 where src_stm_id in (1,30,42,48,49)
30 with UR
31 )
32 ;
33 disconnect from mycon;
NOTE: Statement not executed due to NOEXEC option.
34 quit;
NOTE: The SAS System stopped processing this step because of errors.
Any thoughts/points that I can take to the admin plz?
Then remove READBUFF= option, maybe it is not supported by DB2 .
And did you try my 'connect to db2(datasrc=.....)' ?
Thank you Xia. My apologies that I didn't post that log too. Forgive me, just slipped through the cracks. Here it is
22 GOPTIONS ACCESSIBLE;
23 proc sql;
24
25 connect to db2(Datasrc=MODPOTBD schema='OTBND' readbuff=10000) ;
ERROR: Invalid option name schema.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
26
27
28
29 create table want as
30
31 select *
32
33 from connection to db2 (select * from ar where src_stm_id in (1,30,42,48,49) );
NOTE: Statement not executed due to NOEXEC option.
34
35 quit;
proc sql;
25
26 connect to db2(Datasrc=MODPOTBD schema='OTBND' ) ;
ERROR: Invalid option name schema.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
27
28
29
30 create table want as
31
32 select *
33
34 from connection to db2 (select * from ar where src_stm_id in (1,30,42,48,49) );
NOTE: Statement not executed due to NOEXEC option.
35
36 quit;
Try to remove option SCHEMA= .
Then just try first to assign the libref using the meta engine without the READBUFF option. That should work.
24 %let lref=potbnd;
25 libname &lref meta liburi="SASLibrary?@Libref='&lref'" metaout=data;
If it doesn't: I remember having encountered a case where using the META engine didn't work for a metadata libname definiton for Hadoop. The reason was some quoting issue in the libname metadata definition which somehow SAS overcome when assigning the library directly but not when using the meta engine. Once we've fixed the libary metadata definition things worked as expected.
Eventually add option LIBDEBUG to the libname statement to get more log information.
I didn't have an environment where I could test if adding an option to a library definition pointing to a database works. I did test it for a library using the BASE engine and there it worked.
@Patrick Thank you so much. This
25 %let lref=potbnd;
26 libname &lref meta liburi="SASLibrary?@Libref='&lref'" metaout=data;
NOTE: Libref POTBND was successfully assigned as follows:
Engine: META
Physical Name: MODPOTBD
worked 🙂
@Ksharp Thank you,
24 proc sql;
25 connect to db2(Datasrc=MODPOTBD) ;
ERROR: CLI error trying to establish connection: [IBM][CLI Driver] SQL30082N Security processing failed with reason "3" ("PASSWORD
MISSING"). SQLSTATE=08001
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
26 create table want as
27 select *
28 from connection to db2 (select * from ar where src_stm_id in (1,30,42,48,49) );
NOTE: Statement not executed due to NOEXEC option.
Anyways to get the SQL-pass through to work?
Anyways to get the SQL-pass through to work?
You can either use the connection already established via a libname:
connect using &lref as mycon;
Or you define a new connection but then you need to fully define it passing in all the information which was also required to define the libname.
@novinosrin - Do you have SAS administrators at your site who can guide you regarding database connections? If so then they can likely provide far quicker answers than us who have to guess how your connections could be configured.
Good morning/evening @SASKiwi How I wish the so called administrators were even responsive in the 1st place. A formal ticket through technology help desk either remains unresolved or closed without attention. On top of that, the advice I get is run the junk process late at night or in the weekends.
Yes indeed, I have escalated this with my boss and hopefully he will do something about it "yet". To answer my frustration, I combined many steps of the process wrote by "somebody" back in 2015 to something advanced like multiple hash/array(smaller tables), barring one dataset read from disk to get things going for me, however even the very syntax/approach got a push back from the team asking me to keep it simple for the sake of the team's comfort and rather have fun on "our communities" with such codes.
In essence, there seems a no-win whatsoever though my manager seems to acknowledge the good suggestions that I give, albeit I wouldn't be surprised if I get another advice saying "work for money, learn for passion". That is reasonable coz as i graduated from DePaul, I got much more than what I asked for. However, it's my self ethic that wants me to contribute more and I am afraid it's not happening easily. Henceforth, I am being cheeky in asking for your help here to charm my intent and luck.
THe LOG said you need offer password . Can you ask your Admin for PW?
connect to db2(Datasrc=MODPOTBD password=xxxx ) ;
Thank you all for chiming in this thread and helping the best way you could. I just got a response from the admins, of course an honest response with their admission that they tried and they couldn't get the SQL pass through to get up and running. Of course, one could take up by logging a ticket with SAS Tech Support, however I personally have decided not to suggest them to pursue for the reason I have already gotten the vibe most teams here and their SAS users are not particularly interested in High performance solutions/sophisticated programming solutions that may involve any part of he architecture/design for that matter.
I wish I could mark one answer and ethically it wouldn't be fair. However, I want your trust that how much I valued your time for helping me out more so when the thread gotten longer. Thank you again! Cheers!
If you've got already a working libname then are you telling us that even the syntax as below doesn't work?
connect using &lref as mycon;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.