I've a SAS table like below which will update periodically. Here i wish to use some transformation to capture only the updated records in my output.
Id Name Salary Address
1 Ram 1000 aaaaa
2 Ravi 2000 bbbbb
3 Kiran 1500 ddddd
After some point of time this table will be updated as below by clients if we have any changes over the records.
Id Name Salary Address
1 Ram 2000 aaaaa
2 Ravi 2000 bbbbb
3 Kiran 1500 ccccc
So I need the output as below via SAS transformation for the changed records.
Id Name Salary Address
1 Ram 2000 aaaaa
3 Kiran 1500 ccccc
I've tried to achieve this via Table loader transformation with update\insert technique, but it is providing the output by holding the old records (like append transformation).
Can someone suggest the suitable transformation with the detail steps to achieve this?
think on adding date (date/time source) to the records as some information point.
I do not know what modeling method has been chosen. With a Data Vault Modeling - Wikipedia, the free encyclopedia approach is a principle to store that kind of info along all other record fields.
Hi Jaap Karman,
Thanks for your response. But I'm wondering for transformations like CDC, SCD type-1 to achieve this feat rather than your suggestion. Let me know if you required any further details from me.
Do you have a copy of the original dataset before it be updated ?
Maybe you should check Audit Trail .
Xia Keshan
Hi Ksharp,
I don't have the copy of the original dataset. If I'm not wrong, audit trail table will hold only the details like number of rows processed and not the output which I expect.
If I remember correctly , Audit Trail also will hold the updated information. Check the documentation . I don't do it for a long time .
Good Luck .
Even though the DIS loaders work out internally which records need update and which insert this information is only kept in work tables which get cleaned out at the end of the loader transformation. I'm not aware of any loader option which would allow you to retain these tables.
The 2 ways I can currently think of to get you what you want (by using DIS and avoid too much custom code):
1) Use an SCD2 loader as this will maintain full change history in your table. The latest updated records will be the once with more than one record per Retained Key and the latest Changed Date. The SCD2 loader is "out-of-the-box" but you would still need to implement some kind of reporting for record selection and if having a type 2 table is not a requirement then you would probably also want to have some housekeeping process which in the end removes all history records (once you've reported on them).
To make it easy for users of your data: You probably would implement a view over the type 2 table which selects only the current records so your users don't get exposed to a type 2 table.
2) You turn on an audit trail and then report on this audit trail as suggests. To implement the audit trail is a one-off task (that sticks with the physical table then). You still would need to implement some reporting and some house keeping to truncate the audit trail from time to time so that it doesn't "blow up" too much.
Both using a type 2 loader and an audit trail will have a negative impact on performance.
No matter what you do. Do not start coding before understanding concepts.
CDC Change data capture - Wikipedia, the free encyclopedia is just passing updates inserts and deletes. The disadvantage is there can failures in the process of synchronization as it is partial synchronization.
SCD-1 is one the Slowly changing dimension - Wikipedia, the free encyclopedia where data collection approach for building cubes. The type-1 one is a simple one but often not applicable as of the meaning of the changes.
Knowing your desing of the dwh building the processes you will see wich transformations are needed (Join merge set etc.)
The auditt trail of the original DBMS could be the source for a CDC approach.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.