Hello Folks,
My team and I have a task in reading DB2 tables for a process. I am not sure how the architecture is set up. Basically, I have been given access to read DB2 tables with a libname. And the following took
23 data temp17_ar_1 ; 24 set potbnd.ar(keep=src_stm_id) ; 25 where src_stm_id in (1,30,42,48,49); 26 run; NOTE: There were 55749594 observations read from the data set POTBND.AR. NOTE: The data set WORK.TEMP17_AR_1 has 12757931 observations and 1 variables. NOTE: DATA statement used (Total process time): real time 25:29.08 cpu time 5:46.61
Okay, some of you may think that's fine. I hope you don't. I was once part of short term project where there was a need a access DB2 tables in a different organisation. The table sizes are comparable to the one I am currently reading. However, The execution took less than half the time. Come on, we can do better here too. Please advice. Thank you & Regards!
First of all, it seems that the where condition is not handed off to the DB.
And then you need to check observation numbers * observation size against the available network bandwidth between the servers.
After all, 4 fifths of the time SAS was waiting for data.
Thank you Sir @Kurt_Bremser for the response. I might sound silly and dumb as I admit I have no knowledge on this. Henceforth, I request your patience. However, I am gonna consolidate your points and hopefully other's(who may chime) in this thread and take it forward to the authorities concerned
BD2 tables of interest. Proc contents isn;t giving me the needed metadata info. So i did a
select count(*) and got this
POTBND DB2 Tables |
Record_count |
AR |
55,749,594 |
AR_TVR_PRFL |
13,984,214 |
AR_X_AR |
15,977,287 |
FNC_SVC_AR |
21,052,845 |
MSR_PRD |
52,258 |
Not sure how I get this info-" available network bandwidth between the servers."
After all, 4 fifths of the time SAS was waiting for data.- That's no good right? 😞
As soon as you move data over a network connection, that will be the bottleneck. The goal is to maximise the (logical) throughput. So handing off the processing of data-reducing actions (where, keep) to the DB is #1.
Your network admins will be able to tell you what the maximum expectable bandwidth of your connection is, and when you compare that to your actual throughput, you'll see if further improvement is possible at all.
Your DB admins should be able to provide you with table metadata (column types and other attributes), from which you can calculate observation sizes.
Thank you Sir @Kurt_Bremser I will take that to DB2 Admins and other concerned folks. In general, once the SAS/ACCESS is established to DB2, could the engine not give us the performance we need, even if it takes a little hit but not to the extent of what it shows in the above LOG. Am i being reasonable to have this expectation?
here is the exact version info from the AIX server,
-------------------------------------------------------------------------------
Host: r64
Product: db2
Version: 9.4
Display Name: SAS/ACCESS Interface to DB2
Display Version: 9.4_M3
I don't know how you connect to DB2. But try some options like : libname potbnd DB2 datasrc=xxxx readbuff=10000 bulkload dbcommit=10000; AND proc copy .....
I don't know how you connect to DB2. But try some options like : libname potbnd DB2 datasrc=xxxx readbuff=10000 bulkload dbcommit=10000; AND proc copy .....
Thank you Xia. All I do is, login to SAS EG, on the left hand side, in the library list where I just right click on the name, the pop asks to assign or unassign. Once I click assign, it turns "Yellow" stating it's assigned
@novinosrin - To properly test DB2 connection performance you will need to learn to define your own database connections so you can experiment with different options. Just look at the Properties of your existing DB2 connections to figure out how to do this.
I suggest you also need to try SQL passthru where you can be sure everything is running the database. Just all part of the learning process for using external databases!
Sir @SASKiwi Thought of the angel, I was gonna plug you. Just felt shy to get cheeky.
1.Properties of your existing DB2 connections
How can i do that? Is contacting DB2 folks the only way? or can it be done pro grammatically?If yes, may i get some examples?
2. For SQL pass through, would CONNECT work once the SAS EG assign through libname is already established?
@novinosrin - In EG select a DB2 connection in your Server List, right click and select Properties. Most of what you need in a LIBNAME statement should be shown after Options.
Regarding SQL passthru CONNECT: You can refer to a LIBNAME here by doing: CONNECT USING <libref>.
Sir @SASKiwi Thank you for the patience. This is what I see in properties
General DB2-POTBND
Type:Library
Server: SASApp
Engine: DB2
Location: /Shared Data/DB2-POTBND
Options: schema='OTBND'
READ_ISOLATION_LEVL=UR
Datasrc=MODPOTBD
Libref: POTBND
Read Only :No
Temporary: No
Here are two options . I recommend to use @SASKiwi 's pass through .
1)
proc sql;
connect to db2(Datasrc=MODPOTBD schema='OTBND' readbuff=10000) ;
create table want as
select *
from connection to db2 (select * from xxxxxx where ............. );
quit;
2)
libname x db2 Datasrc=MODPOTBD schema='OTBND' readbuff=10000 ;
proc sql;
create table want as
select *
from x.xxxxxx
where ................;
quit;
Hi @Ksharp You have given me some smile
24 proc copy in=potbnd out=work memtype=data;
25 select ar;
26 run;
NOTE: Copying POTBND.AR to WORK.AR (memtype=DATA).
NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
NOTE: There were 55749594 observations read from the data set POTBND.AR.
NOTE: The data set WORK.AR has 55749594 observations and 76 variables.
NOTE: Compressing data set WORK.AR decreased size by 74.12 percent.
Compressed is 327922 pages; un-compressed would require 1267037 pages.
NOTE: PROCEDURE COPY used (Total process time):
real time 26:03.24
cpu time 8:37.57
Much better than the datastep. Only thing is I wish I could apply a FILTER
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.