Desktop productivity for business analysts and programmers

Working only on updated observations.

Reply
Frequent Contributor
Posts: 77

Working only on updated observations.

Hello,

I have one query, I am working on observations which are automatically updated in the Oracle database (obs will be updated in only one table and that is used in ETL) And then runs ETL on that data-set at the specific time.

 

I already made ETL which runs at night which takes two hours. now I want to change in  ETL which runs only on newly updated Observations from last used data and give final output.

coz I have to save time by working only on those observations which are updated instead of processing the whole observations.

e.g.

ETL runs at 2 AM
and ends at 4 AM


now if I am running ETL manually at 3 PM then this ETL only works on Updated observations(i.e. only on 2 AM to 3 PM data and gives output. )

..What should I do for this task.

Super User
Super User
Posts: 9,203

Re: Working only on updated observations.

Posted in reply to india2016

Well, not sure how you would go about this in ETL as don't use it, however what I can say is this.  Oracle normally has a system field for last date/time of change.  You should be able to use this in the SQL you use to extract the data, so its just a matter of putting a where on your SQL to take records only where datetime of last is after your given point:

proc sql;
  connect to oracle...;
  select * from connection to oracle (
    select...
    from...
    where  LASTCHGTS >= "01FEB2017T12:00"dt);
  disconnect from oracle;
quit;

 

Frequent Contributor
Posts: 77

Re: Working only on updated observations.

How would Oracle or SAS knows which are the updated observations?

isn't It only depend on date variable?.

 

Actually, I don't know which observations will be updated.

Super User
Super User
Posts: 9,203

Re: Working only on updated observations.

Posted in reply to india2016

As I said above, Oracle generates a date/time stamp on data when something changes in the data row.  You query against that date/time to get rows which have changed in the timepoint your are interested.  At least Oracle Clinical does anyways.  I would ask your Oracle Administrator to provide you exact code to extract your data from your system with your requirements, as I have none of this information.

Respected Advisor
Posts: 4,543

Re: Working only on updated observations.

[ Edited ]
Posted in reply to india2016

@india2016

I'm not sure if Oracle always creates such a datestamp as @RW9 wrote but it's certainly something which can be implemented in Oracle (i.e. using CDAC). 

I guess you're not only interested in updated records but also in deleted and newly inserted ones.

 

Have a chat with your Oracle DBA on how to best implement for your requirement. May be everything is already set-up and ready for you, may be there are first some configuration changes on the Oracle side required.

 

If I would have designed your Oracle table then it would have a created date and an updated date column. Check your table if there is already something like that there. If it is then you need nothing else than implementing a control table where you maintain the last date you've run your ETL and pick-up all records with a more current date - and then update your control table with the most current date you've processed.

Ask a Question
Discussion stats
  • 4 replies
  • 140 views
  • 0 likes
  • 3 in conversation