Re: What is the best way to replace a DB2 DXT step with SAS?

Elastic · Posted 02-02-2015 10:34 PM

Hi, I need to migrate a SAS batch job from z/OS to run on Linux system. The job consists of a DB2/DXT step that extracts selected columns from a large DB2 table on z/OS to a flat file before passing the file on to the SAS step for further processing:

//STEP1 EXEC PGM=IKJEFT01,REGION=4096K,TIME=100

//SYSPRINT DD DSN=ABC.DB2LOG,DISP=MOD

//SYSOUT DD SYSOUT=*

//SYSTSPRT DD SYSOUT=*

//SYSABEND DD SYSOUT=*

//SYSUDUMP DD SYSOUT=*

//SYSDUMP DD SYSOUT=*

//DSNTRACE DD SYSOUT=*

//SYSTSIN DD *

PROF PREF(JBGM)

DSN SYSTEM(DSN)

RUN PROGRAM(DVRE0000) -

PLAN(DXT) -

LIBRARY('ABC.DXTLOAD')

END

//SYSIN DD *,DLM=$$

SUBMIT EXTID=JBGMX,

DBS=DB2,FORMAT=EBCDIC,EXTDATA=XDATA,CD=JCS,JCS=JCSOUT

EXTRACT

SELECT COLA, COLB, COLC

FROM DB100.TBL100

WHERE COLA = 'B' ;

$$

/*

//XDATA DD DSN=ABC.TBL100,DISP=(NEW,CATLG,DELETE),

// UNIT=SCRTCH,SPACE=(TRK,(500,50),RLSE),

// DCB=(RECFM=FB,LRECL=100,BLKSIZE=28000),

// VOL=(,,,10)

//JCSOUT DD *

*CD

/*

What would be the best way (in terms of performance) to convert this job so the DXT step can be eliminated and replaced when running on Linux server? By creating a SAS DATA step view, SQL view, or a SAS/Access view like below?

//STEP1 EXEC SAS,TIME=100

......

//TEMP DD DSN=ABC.TBL100,DISP=(,CATLG,CATLG),

// UNIT=SCRTCH,SPACE=(TRK,(200,100),RLSE),

// DCB=(DCO01.DSCB)

.......

//SYSIN DD *

PROC SQL;

CONNECT TO DB2 (SSID=DSN);

CREATE TABLE TEMP.TBL100 AS

SELECT * FROM CONNECTION TO DB2

(SELECT COLA COLB COLC

FROM DB100.TB100

WHERE COLA = 'B');

%PUT SQLXRC=**&SQLXRC** SQLXMSG=**&SQLXMSG**;

QUIT;

Patrick · Posted 02-04-2015 03:14 AM

The example code you've posted doesn't create a SAS SQL view but a SAS table. If you want to use the data in a consecutive data step and you have to access the source data only once then I would eventually go for a SQL view (create view ....).

I've made the experience that the default installation options set for SAS/Access are always not optimal (so that they work in any environment). I would assume that you get the highest performance gains by tweaking these options (eg. readbuff).

SAS/ACCESS(R) 9.4 for Relational Databases: Reference, Sixth Edition

Not sure if DB2 allows for multithreaded read - but this would be something to investigate.

Gosh, it's so long ago, but I believe to remember that "dirty reading" also improved performance a lot so if that's acceptable add "with UR" to your DB2 specific SQL code block.

If you want to tweak things further then you could also "sync" the "receiving" SAS data set with what's coming from the DB2 table, so data set options like "blocksize" and the like could improve performance of moving data from the data base to a SAS table.

Elastic · Posted 02-05-2015 03:38 PM

Thank you for the input.

I have revised the example as follow:

LIBNAME DB100 DB2 SSID=DSN AUTHID=ABC

DIRECT_SQL=YES READBUFF=200;

PROC SQL;

CREATE VIEW TCP103 AS

SELECT COLA AS CUSTNAME,

COLB AS CUSTNO,

COLC AS ADDR

FROM DB100.TB100

WHERE COLB > 100;

RUN;

As this DB2 table is very large that the DXT step would extract over 4.8M rows from it, would the above sample SAS code in creating a SAS SQL view be the most efficient way in replacing the DXT step? When the subsequent step processes the SAS view, it will take as long to access DB2 and scan the large table for the required data manipulation, right? So what would be the advantage of creating a view in comparing to creating a SAS table in my case?

As the DB2 table is accessed by many applications, I am afraid the "dirty reading" with UR is not acceptable in my case.

I will investigate into the SAS/Access options, the DB2 multi-threaded read, and your last point about "syncing' SAS data set with DB2 table (or may not apply in my example above since only temp SAS view is created).

Any further input in helping to optimize this job is very much appreciate. Thanks again for all the good comments in your last reply!

arronlee · Posted 02-06-2016 02:10 AM

Hi, Elastic.

Thanks for sharing your problem. As for me, I have seldom tried to migrate a SAS batch job from z/OS to run on Linux system before. Have you ever worked it out? I wonder whether you have any exprience about pdf extraction process. Because there is something wrong with my pdf reader. I need convert pdf into text or other formats. Any suggestion will be appreciated. Thanks in advance.

Best regards,

Lee

What is the best way to replace a DB2 DXT step with SAS?