BookmarkSubscribeRSS Feed
india2016
Pyrite | Level 9

Hello,

I have one query, I am working on observations which are automatically updated in the Oracle database (obs will be updated in only one table and that is used in ETL) And then runs ETL on that data-set at the specific time.

 

I already made ETL which runs at night which takes two hours. now I want to change in  ETL which runs only on newly updated Observations from last used data and give final output.

coz I have to save time by working only on those observations which are updated instead of processing the whole observations.

e.g.

ETL runs at 2 AM
and ends at 4 AM


now if I am running ETL manually at 3 PM then this ETL only works on Updated observations(i.e. only on 2 AM to 3 PM data and gives output. )

..What should I do for this task.

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, not sure how you would go about this in ETL as don't use it, however what I can say is this.  Oracle normally has a system field for last date/time of change.  You should be able to use this in the SQL you use to extract the data, so its just a matter of putting a where on your SQL to take records only where datetime of last is after your given point:

proc sql;
  connect to oracle...;
  select * from connection to oracle (
    select...
    from...
    where  LASTCHGTS >= "01FEB2017T12:00"dt);
  disconnect from oracle;
quit;

 

india2016
Pyrite | Level 9

How would Oracle or SAS knows which are the updated observations?

isn't It only depend on date variable?.

 

Actually, I don't know which observations will be updated.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

As I said above, Oracle generates a date/time stamp on data when something changes in the data row.  You query against that date/time to get rows which have changed in the timepoint your are interested.  At least Oracle Clinical does anyways.  I would ask your Oracle Administrator to provide you exact code to extract your data from your system with your requirements, as I have none of this information.

Patrick
Opal | Level 21

@india2016

I'm not sure if Oracle always creates such a datestamp as @RW9 wrote but it's certainly something which can be implemented in Oracle (i.e. using CDAC). 

I guess you're not only interested in updated records but also in deleted and newly inserted ones.

 

Have a chat with your Oracle DBA on how to best implement for your requirement. May be everything is already set-up and ready for you, may be there are first some configuration changes on the Oracle side required.

 

If I would have designed your Oracle table then it would have a created date and an updated date column. Check your table if there is already something like that there. If it is then you need nothing else than implementing a control table where you maintain the last date you've run your ETL and pick-up all records with a more current date - and then update your control table with the most current date you've processed.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1026 views
  • 0 likes
  • 3 in conversation