BookmarkSubscribeRSS Feed
novinosrin
Tourmaline | Level 20

Hello Folks,

 

My team and I have a task in reading DB2 tables for a process. I am not sure how the architecture is set up. Basically, I have been given access to read DB2 tables with a libname. And the following took

23         data temp17_ar_1 ;
24         set potbnd.ar(keep=src_stm_id) ;
25         where src_stm_id in (1,30,42,48,49);
26         run;

NOTE: There were 55749594 observations read from the data set POTBND.AR.
NOTE: The data set WORK.TEMP17_AR_1 has 12757931 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           25:29.08
      cpu time            5:46.61

Okay, some of you may think that's fine. I hope you don't. I was once  part of short term project where there was a need a access DB2 tables in a different organisation. The table sizes are comparable to the one I am currently reading. However, The execution took less than half the time. Come on, we can do better here too. Please advice. Thank you & Regards!

39 REPLIES 39
Kurt_Bremser
Super User

First of all, it seems that the where condition is not handed off to the DB.

And then you need to check observation numbers * observation size against the available network bandwidth between the servers.

After all, 4 fifths of the time SAS was waiting for data.

novinosrin
Tourmaline | Level 20

Thank you Sir @Kurt_Bremser  for the response. I might sound silly and dumb as I admit I have no knowledge on this. Henceforth, I request your patience. However, I am gonna consolidate your points and hopefully other's(who may chime) in this thread and take it forward to the authorities concerned

 

BD2 tables of interest. Proc contents isn;t giving me the needed metadata info. So i did a 

select count(*) and got this

POTBND DB2 Tables

Record_count

AR

55,749,594

AR_TVR_PRFL

13,984,214

AR_X_AR

15,977,287

FNC_SVC_AR

21,052,845

MSR_PRD

52,258

 

Not sure how I get this info-" available network bandwidth between the servers."

 

After all, 4 fifths of the time SAS was waiting for data.-  That's no good right? 😞

 

 

 

 

Kurt_Bremser
Super User

As soon as you move data over a network connection, that will be the bottleneck. The goal is to maximise the (logical) throughput. So handing off the processing of data-reducing actions (where, keep) to the DB is #1.

Your network admins will be able to tell you what the maximum expectable bandwidth of your connection is, and when you compare that to your actual throughput, you'll see if further improvement is possible at all.

Your DB admins should be able to provide you with table metadata (column types and other attributes), from which you can calculate observation sizes.

novinosrin
Tourmaline | Level 20

Thank you Sir @Kurt_Bremser  I will take that to DB2 Admins and other concerned folks. In general, once the SAS/ACCESS is established to DB2, could the engine not give us the performance we need, even if it takes a little hit but not to the extent of what it shows in the above LOG. Am i being reasonable to have this expectation?

 

here is the exact version info from the AIX server,
-------------------------------------------------------------------------------
Host: r64
Product: db2
Version: 9.4
Display Name: SAS/ACCESS Interface to DB2
Display Version: 9.4_M3

Ksharp
Super User
I don't know how you connect to DB2.
But try some options like :

libname potbnd DB2 datasrc=xxxx readbuff=10000 bulkload dbcommit=10000;

AND

proc copy .....


Ksharp
Super User
I don't know how you connect to DB2.
But try some options like :

libname potbnd DB2 datasrc=xxxx readbuff=10000 bulkload dbcommit=10000;

AND

proc copy .....


novinosrin
Tourmaline | Level 20

Thank you Xia. All I do is, login to SAS EG, on the left hand side, in the library list where I just right click on the name, the pop asks to assign or unassign. Once I click assign, it turns "Yellow" stating it's assigned

SASKiwi
PROC Star

@novinosrin  - To properly test DB2 connection performance you will need to learn to define your own database connections so you can experiment with different options. Just look at the Properties of your existing DB2 connections to figure out how to do this.

 

I suggest you also need to try SQL passthru where you can be sure everything is running the database. Just all part of the learning process for using external databases!

novinosrin
Tourmaline | Level 20

Sir @SASKiwi Thought of the angel, I was gonna plug you. Just felt shy to get cheeky. 

 

1.Properties of your existing DB2 connections 

 

How can i do that? Is contacting DB2 folks the only way? or can it be done pro grammatically?If yes, may i get some examples?

 

2.  For SQL pass through, would CONNECT work once the SAS EG assign through libname is already established?

SASKiwi
PROC Star

@novinosrin - In EG select a DB2 connection in your Server List, right click and select Properties. Most of what you need in a LIBNAME statement should be shown after Options.

 

Regarding SQL passthru CONNECT: You can refer to a LIBNAME here by doing: CONNECT USING <libref>.

novinosrin
Tourmaline | Level 20

Sir @SASKiwi  Thank you for the patience. This is what I see in properties

 

General DB2-POTBND
Type:Library
Server: SASApp
Engine: DB2
Location: /Shared Data/DB2-POTBND

Options: schema='OTBND'
READ_ISOLATION_LEVL=UR
Datasrc=MODPOTBD

Libref: POTBND
Read Only :No
Temporary: No

Ksharp
Super User

Here are two options . I recommend to use @SASKiwi  's pass through .

 

1)

proc sql;

connect to db2(Datasrc=MODPOTBD schema='OTBND' readbuff=10000) ;

 

create table want as

select *

 from connection to db2 (select * from xxxxxx  where ............. );

quit;

 

 

 

2)

libname x db2 Datasrc=MODPOTBD schema='OTBND' readbuff=10000 ;

proc sql;

create table want as

 select *

 from x.xxxxxx

  where ................;

quit;

 

 

 

 

 

novinosrin
Tourmaline | Level 20

Thank you @Ksharp  and @SASKiwi , Let me try and update you how that goes. More importantly, Thank you for your patience and tolerance for all silly and basic stuff but that is something I am zero. 

novinosrin
Tourmaline | Level 20

Hi @Ksharp  You have given me some smile



24         proc copy in=potbnd out=work memtype=data;
25            select ar;
26         run;

NOTE: Copying POTBND.AR to WORK.AR (memtype=DATA).
NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
NOTE: There were 55749594 observations read from the data set POTBND.AR.
NOTE: The data set WORK.AR has 55749594 observations and 76 variables.
NOTE: Compressing data set WORK.AR decreased size by 74.12 percent. 
      Compressed is 327922 pages; un-compressed would require 1267037 pages.
NOTE: PROCEDURE COPY used (Total process time):
      real time           26:03.24
      cpu time            8:37.57

Much better than the datastep. Only thing is I wish I could apply a FILTER

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 39 replies
  • 1725 views
  • 6 likes
  • 5 in conversation