Reeza
Super User

I have a transactional database with approximately 120 million records and 100 variables.

A record is entered and then goes through several changes before it's finalized, with each change entered as a distinct transaction. 

If I look only at final records, I have just 30 million, so on average a record goes through 4 changes.

I'm relatively new to this database. The main field doesn't change much, but I'd like to find the best way to identify which fields change the most between transactions up to finalization.

Any suggestions on how to efficiently analyze/solve this?

2 REPLIES
Doc_Duke
Rhodochrosite | Level 12

Efficiently, no.

However, if you are looking for estimates, start with a random sample of transaction sets and use PROC COMPARE on each pair of consecutive transactions within a set. You can output the results and summarize them across the sample. 5,000 sets is probably enough to get a handle on what is happening on average. If you are searching for the unique or outlier changes, you are stuck with working with the entire dataset.
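To make the sample-and-compare idea concrete, here is a minimal sketch of the logic in Python rather than SAS. It is not a PROC COMPARE replacement, just an illustration of counting field changes between consecutive transactions. The function name `count_field_changes`, the columns `record_id` and `seq`, and the toy data are all hypothetical; the real table would need its own record key and a field that orders the transactions.

```python
import random
from collections import Counter, defaultdict

def count_field_changes(transactions, key, order, sample_size=None, seed=0):
    """Count, per field, how often the value differs between consecutive
    transactions of the same record. `key` and `order` are assumed column
    names for the record identifier and the transaction sequence."""
    # Group the transactions into one "set" per record key.
    sets = defaultdict(list)
    for row in transactions:
        sets[row[key]].append(row)

    # Optionally keep only a random sample of transaction sets.
    keys = sorted(sets)
    if sample_size is not None and sample_size < len(keys):
        keys = random.Random(seed).sample(keys, sample_size)

    changes = Counter()
    pairs = 0
    for k in keys:
        history = sorted(sets[k], key=lambda r: r[order])
        # Compare each transaction with the one that follows it.
        for before, after in zip(history, history[1:]):
            pairs += 1
            for field in after:
                if field in (key, order):
                    continue
                if before.get(field) != after[field]:
                    changes[field] += 1
    return changes, pairs

# Hypothetical toy data: two records with three and two transactions.
transactions = [
    {"record_id": 1, "seq": 1, "amount": 100, "status": "NEW", "region": "E"},
    {"record_id": 1, "seq": 2, "amount": 120, "status": "OPEN", "region": "E"},
    {"record_id": 1, "seq": 3, "amount": 120, "status": "FINAL", "region": "E"},
    {"record_id": 2, "seq": 1, "amount": 50, "status": "NEW", "region": "W"},
    {"record_id": 2, "seq": 2, "amount": 50, "status": "FINAL", "region": "W"},
]

changes, pairs = count_field_changes(transactions, key="record_id", order="seq")
for field, n in changes.most_common():
    print(f"{field}: changed in {n} of {pairs} consecutive pairs")
```

Passing something like `sample_size=5000` restricts the count to a random sample of transaction sets. Fields that never change simply do not appear in the result.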

jakarman
Barite | Level 11

If that transactional database is a SAS data set, you could use an audit trail: http://support.sas.com/documentation/cdl/en/lrcon/67885/HTML/default/viewer.htm#n0ndg2uekz7qkbn1caok...

The same concept is also used in OLTP DBMS systems, where it is often called a journal or log file. These can be used for roll-back / roll-forward recovery processes.

When that is used, those updates commonly get out of sync with extracted versions.

---->-- ja karman --<-----
