SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

HOW TO IMPLEMENT ORACLE CDC TRANSFORMATION FOR DAILY INCREMENTAL LOAD IN SAS DI STUDIO.

Reply
New Contributor
Posts: 3

HOW TO IMPLEMENT ORACLE CDC TRANSFORMATION FOR DAILY INCREMENTAL LOAD IN SAS DI STUDIO.

[ Edited ]

HI,

Actually i had 4 crores of records and i built the SAS DI job for historic run it has taken 10 hours. since its historic it is one time run.

The problem is I have to build the job for incremental for daily run. so for daily incremental run i have to pick only the changed records from souce and run the job so that i can do my daily run within 2 hours approximately. so i want to implenent ORACLE CDC TRANSFORMATION after source table. Please help me how to go a head with this transformation step by step. and how to select the options inside CDC Transformation and how to use Control table if it is necessary. Plz try to help me. THANKS IN ADVANCE.

Respected Advisor
Posts: 4,173

Re: HOW TO IMPLEMENT ORACLE CDC TRANSFORMATION FOR DAILY INCREMENTAL LOAD IN SAS DI STUDIO.

[ Edited ]

To use CDC there are most likely also changes on the Oracle side required (creating CDC records on the detail level you need it).

 

Before you jump for a CDC approach: Are there other ways which would allow you to identify deltas in source? For example is there something like a valid_from_dttm, load_dttm, update_dttm, load/batch_id column which you could use?

 

Are both the source and the target table in Oracle? And if yes: Are they on the same Oracle server?

 

How are your source and target tables structured and how do you need to load the target (Insert, update, delete, SCD2,....?).

Normally the delta load into target is the step where you must pay the most attention to get it right from a performance perspective.

New Contributor
Posts: 3

Re: HOW TO IMPLEMENT ORACLE CDC TRANSFORMATION FOR DAILY INCREMENTAL LOAD IN SAS DI STUDIO.

valid_from_dttm, load_dttm, update_dttm, load/batch_id column columns are not available in source.
Source and target table are in Oracle.Both other are on the same server. I want to load the target table by SCD2.
Respected Advisor
Posts: 4,173

Re: HOW TO IMPLEMENT ORACLE CDC TRANSFORMATION FOR DAILY INCREMENTAL LOAD IN SAS DI STUDIO.

[ Edited ]

With CDC you would have to cater for occurences where wrong data got loaded into your target, then this data got backed out and then correct data got loaded (things which happen in a real live scenario).

 

Is there ANY other column in your target which would allow you to identify deltas without having to compare source to target?

 

For example if your source table gets only inserts then something like a sequence number (i.e. primary key) could very well serve the purpose.

 

You will need to describe a bit better what you have in order for us to give you concrete advice; else it will be "RTM"

 

Are you loading into a OOB table of one of the SAS solutions? If so can you please tell us which solution, datamodel & table?

 

 

 

New Contributor
Posts: 3

Re: HOW TO IMPLEMENT ORACLE CDC TRANSFORMATION FOR DAILY INCREMENTAL LOAD IN SAS DI STUDIO.

Actually my source table contains History + new records . so I have to pick only updated and newly inserted records from source table.

(Actual scenario is I have taken 50 columns from source and after applying all the required logics now I got 75 columns (50 source columns + 25 newly generated columns).so I have 75 columns in history load. So the same logics will be implemented for incremental run once I.get new and updated records from source table.
Respected Advisor
Posts: 4,173

Re: HOW TO IMPLEMENT ORACLE CDC TRANSFORMATION FOR DAILY INCREMENTAL LOAD IN SAS DI STUDIO.

When loading from Oracle to Oracle then using the DIS SCD2 loader is sub-optimal as it doesn't do all the processing in-database. So if it's about performance then first step will be to implement a custom transformation for this.

 

What volumes are we talking about? Source, Target, Deltas (increments)?

 

As for CDC: Has it been turned on? Would you know where to source the data from and do you have access?

https://docs.oracle.com/cd/B28359_01/server.111/b28313/cdc.htm

http://www.dba-oracle.com/t_streams_change_data_capture.htm

 

Are you 100% sure that there is absolutely no column in your target table which would allow you to identify the deltas?

 

Ask a Question
Discussion stats
  • 5 replies
  • 435 views
  • 0 likes
  • 2 in conversation