learn_SAS_23
Quartz | Level 8
Hello Team,

We are using the DB2 bulk loader (CLILOAD) transformation while loading to our target database.

Every day we receive 9 to 10 million rows to load into the target table; the target table is partitioned by month and indexed on a date column.

Loading usually takes 3 to 4 hours; if some other process uses the table in the meantime, it creates a table lock that blocks the load process.

How can we gain more speed in the loading process, and how can we resolve the database locks?

Please post your tips and tricks on this issue.
learn_SAS_23
Quartz | Level 8

I am loading from DB2 (stage table to target table); both are on the same UNIX server.

Kurt_Bremser
Super User

Since you already have the data in the DBMS, use explicit pass-through in a user-written transformation. Otherwise you run the risk of SAS pulling the data back from the DBMS for the bulk load, causing double, unnecessary network traffic.
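A minimal sketch of such explicit pass-through, as it could appear in a user-written transformation (the database alias, schema, and table names are placeholders, not from your setup):

proc sql;
  /* connect straight to DB2; DATABASE= takes the DB2 alias (placeholder here) */
  connect to db2 (database=PRODDB user=&dbuser password="&dbpass");
  /* the INSERT runs entirely inside DB2, so no rows travel through SAS */
  execute (
    insert into TRGSCHEMA.TARGET_TABLE
    select * from STGSCHEMA.STAGE_TABLE
  ) by db2;
  disconnect from db2;
quit;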

Patrick
Opal | Level 21

This sounds more like a DB2 problem than a SAS problem. If another process is issuing the table lock, then I guess that's nothing you can change. If it's your process requiring an exclusive table lock and some read process prevents this, then there might be options.

 

What about loading the data first from SAS into a staging table in DB2 (an empty table with identical columns but no partitioning, constraints, or indexes)?

Then load the data within DB2 from the staging table into the target table (using explicit pass-through SQL). Because this process then runs fully within DB2, it will also be much easier to get support from a DB2 admin/expert.

 

It's certainly possible to insert into a table without needing an exclusive table lock, but which loading approach to choose will depend on your target table (i.e. whether there is also referential integrity, triggers, etc. on it).
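To sketch that first step, here is a minimal bulk load from SAS into an empty DB2 staging table via CLI LOAD (the libref, connection options, and table names are placeholders):

libname db2stg db2 database=PRODDB schema=STGSCHEMA user=&dbuser password="&dbpass";

/* BULKLOAD=YES with BL_METHOD=CLILOAD invokes the DB2 CLI LOAD facility */
data db2stg.stage_table (bulkload=yes bl_method=cliload);
  set work.daily_delta;   /* the daily delta rows; dataset name is hypothetical */
run;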

 

learn_SAS_23
Quartz | Level 8
Thanks Patrick,

Yes, we are using a similar way of loading into an empty DB2 stage table and then loading to the target table using the SAS DIS DB2 bulk loader transformation (CLILOAD). As the target table already holds a huge amount of data, we are appending new rows to the existing table.

Can you post a bit more detail on which options to use in the DB2 loader transformation to avoid table locks and allow other processes to read the existing data?
Patrick
Opal | Level 21

Hmmm... The only way I've ever used bulk load, and the DIS transformations that allow bulk loading, was from an external data source (i.e. a SAS table) into the target DBMS table. I never even considered bulk-loading from one DBMS table to another within the same DB, and I'm also not sure whether this would be faster, and especially whether SAS-generated code would execute such a process fully in-database or would first pull the source table to the SAS side and only then bulk-load from the SAS side to the target table.

 

What I had in mind is to first bulk-load your SAS table into an empty DBMS staging table to which you can have exclusive access, and then as a second step run an INSERT from the staging table into the target table. There you can of course turn on parallelism, load directly into a partition, commit once, control when indexes and statistics get updated, or do whatever else can speed up the load process.

 

If you can't get the second step (the INSERT) to perform as you need, then you've got both source and target already in DB2, which allows you to "take SAS out of the picture" and get support from a DB2 expert. Any code that can be executed from a DB2 client can also be executed via SAS (in a PROC SQL EXECUTE block as explicit pass-through SQL that just gets sent "as is" to the DB for execution).

In DIS this second step would then likely become user-written code or possibly a custom transformation, along the lines of the sketch below.
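As a sketch of that second step (all names are placeholders, and the RUNSTATS call via ADMIN_CMD is just one example of controlling when statistics get refreshed; the actual DB2-side tuning should come from your DBA):

proc sql;
  connect to db2 (database=PRODDB user=&dbuser password="&dbpass");
  /* a plain INSERT...SELECT normally takes row-level locks only,
     so concurrent readers are not blocked (unless locks escalate) */
  execute (
    insert into TRGSCHEMA.TARGET_TABLE
    select * from STGSCHEMA.STAGE_TABLE
  ) by db2;
  /* refresh statistics explicitly after the load via the ADMIN_CMD procedure */
  execute (
    call sysproc.admin_cmd('runstats on table TRGSCHEMA.TARGET_TABLE and indexes all')
  ) by db2;
  disconnect from db2;
quit;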

 

I haven't worked with DB2 for many years, so I can't help much more with details. Here are some documentation links; the SAS options for CLI LOAD are related in name to the parameters available within the DB2 utilities.

https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/acreldb/p1w9ial7770y6tn1rtfiuwtfg088.htm

https://www.ibm.com/docs/en/db2/11.5?topic=data-importing-cli-load

https://www.ibm.com/docs/en/db2/11.5?topic=load-options-improving-performance

 

3-4 hours to load 10M rows feels awfully long. It depends of course on the row size of your source table and on what the target table looks like (e.g. foreign key constraints can slow things down a lot).

learn_SAS_23
Quartz | Level 8
Thanks Patrick and Kurt for your valuable suggestions and link references; they are insightful.
As Patrick suspected, the target table is huge, with close to 60 columns and 11 foreign keys, which slows down the process. We are using bulk load from a SAS dataset to the stage DB2 table and then the bulk loader again from the stage DB2 table to the target DB2 table. I will try inserting the data within DB2, and we are also checking with our DB2 DBA for tips.


Kurt_Bremser
Super User
(Accepted Solution)

Using Bulk Load to create/populate the staging table is the right way to go, but from there all processing has to be done in the DB, and that means explicit pass-through, or an SQL job done exclusively from the same scheduler that initiated the SAS job.

 

Just to give you an alternate perspective: we create flat files from SAS, let the data center people (who control the main scheduler) upload the files to the DB server, and run an insert there that loads the data into the DB. Our SAS does not have (and does not need) any direct access to the DB.

learn_SAS_23
Quartz | Level 8
Thanks Kurt for the detailed information.
One of our DB2 experts suggested a similar way of handling this situation: creating flat files from SAS with the delta data and uploading them to the DB2 table through the INGEST method (which doesn't create any table locks and allows users to read committed data while the table is loaded with the delta data). A sketch of that route is below.
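For reference, a minimal sketch of writing the delta as a delimited flat file from SAS; the DB2-side INGEST command is shown only as a comment, since it runs from a DB2 client rather than from SAS, and all paths and names here are hypothetical:

/* write the delta rows as a pipe-delimited flat file for the DB2 side to pick up */
proc export data=work.daily_delta
            outfile='/data/transfer/daily_delta.del'
            dbms=dlm replace;
  delimiter='|';
run;

/* on the DB2 server, a DBA would then run something along these lines:
   db2 "INGEST FROM FILE /data/transfer/daily_delta.del
        FORMAT DELIMITED BY '|'
        INSERT INTO TRGSCHEMA.TARGET_TABLE"
*/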
Kurt_Bremser
Super User

All our DB to SAS communication is done with flat files.

  • the text files are kept in an archive for future reference and always show the state of the DB at the time the snapshot for the unload was made
  • text files are always readable, on every platform
  • the DB admins keep full control over the time when the DB system will have to deal with the stress of bulk unload/insert
  • we save the cost of the ACCESS license
  • we do not need to keep a DB client on the SAS server (needs additional maintenance)
