Detecting where changes occur in large transactional database

Super User
Posts: 17,868

I have a transactional database with approximately 120 million records and about 100 variables.

A record is entered and then goes through several changes before it's finalized, with each change entered as a distinct transaction. 

If I look at only the final records, I have only 30 million, so on average a record goes through roughly 4 changes (120 million / 30 million).

I'm relatively new to this database. The main field doesn't change much, but I'd like to find the best way to identify which fields change the most between transactions before a record is finalized.

Any suggestions on how to efficiently analyze/solve this?

Trusted Advisor
Posts: 2,113

Re: Detecting where changes occur in large transactional database

Efficiently, no.

However, if you are looking for estimates, start with a random sample of transaction sets and use PROC COMPARE on the pairwise evolution.  You can output the results and summarize over time.  5,000 sets is probably enough to get a handle on what is happening on average.  If you are searching for the unique or outlier changes, you are stuck with working with the entire dataset.
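
To make the sample-then-compare idea concrete, here is a rough sketch in plain Python rather than SAS. It is not PROC COMPARE; it only mimics the pairwise comparison of consecutive transactions within each sampled set and tallies which fields differ. The names `rec_id`, `seq`, `status`, `amount` and `region` are made up for illustration, as is the helper `count_field_changes`; substitute whatever identifies a record and orders its transactions in your data.

```python
import random
from collections import Counter, defaultdict


def count_field_changes(transactions, key, order, sample_size=None, seed=0):
    """Count, per field, how often its value differs between
    consecutive transactions of the same record."""
    # Group the transactions into sets, one set per record key.
    sets = defaultdict(list)
    for row in transactions:
        sets[row[key]].append(row)

    # Optionally keep a random sample of whole transaction sets.
    keys = sorted(sets)
    if sample_size is not None and sample_size < len(keys):
        keys = random.Random(seed).sample(keys, sample_size)

    changes = Counter()
    for k in keys:
        history = sorted(sets[k], key=lambda r: r[order])
        # Compare each transaction with the one immediately before it.
        for prev, curr in zip(history, history[1:]):
            for field in curr:
                if field in (key, order):
                    continue
                if prev.get(field) != curr[field]:
                    changes[field] += 1
    return changes


# Hypothetical toy data: two records with three and two transactions.
rows = [
    {"rec_id": 1, "seq": 1, "status": "new", "amount": 100, "region": "east"},
    {"rec_id": 1, "seq": 2, "status": "open", "amount": 100, "region": "east"},
    {"rec_id": 1, "seq": 3, "status": "final", "amount": 120, "region": "east"},
    {"rec_id": 2, "seq": 1, "status": "new", "amount": 50, "region": "west"},
    {"rec_id": 2, "seq": 2, "status": "final", "amount": 50, "region": "west"},
]

# Fields that never change do not appear in the tally at all.
for field, n in count_field_changes(rows, key="rec_id", order="seq").most_common():
    print(field, n)
```

Passing `sample_size=5000` would restrict the tally to a random sample of transaction sets, which matches the estimate-only use case. Hunting for rare or outlier changes would still mean running it over everything.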

Valued Guide
Posts: 3,208

Re: Detecting where changes occur in large transactional database

If that transactional database is a SAS data set, you could use an audit trail: http://support.sas.com/documentation/cdl/en/lrcon/67885/HTML/default/viewer.htm#n0ndg2uekz7qkbn1caok...

The same concept is used in OLTP DBMS systems, where it is often called a journal or log file. These can be used for roll-back / roll-forward recovery processes.

When that is used, those updates commonly get out of sync with extracted versions of the data.

---->-- ja karman --<-----