We've a process where clients will update the file (say sales details) when there is a need and they intimate us about the change. So our job is to load those updated records to SQL DB table by maintaing the history.
For the first time while loading the data to SQL we just did a 'proc append' to SQL table after reading the file via data step infile Statement. However this method will not work to track the history in SQL when there is a update in the file.
I'm not Aware of SCDs as well. We are not using any Primary/foreign key stuff in SQL.I would like to understand how I can tackle the situation to update the history. I tried with 'Proc SQL update' and 'If first. and last.' but it didn't helped either. Appericiate your help here. I'm OK with any approach to implement this exercise.
Assume I've a file like this.
Id | Name | Sales | Transaction | Example Values | Act_Ind | Valid_From | Valid_To |
1 | Unit | True | 6 | 58001 | Y | 19000101T000000 | 99991231T235959 |
2 | Key | True | 6 | 121216 | Y | 19000101T000000 | 99991231T235959 |
3 | Value | True | 18 | 820595,2 | Y | 19000101T000000 | 99991231T235959 |
Assume I've received an update to the file like this. If you notice Transaction value has changed for Key from 6 to 8.
Id | Name | Sales | Transaction | Example Values | Act_Ind | Valid_From | Valid_To |
2 | Key | True | 8 | 121216 | Y | 20200404T000000 | 99991231T235959 |
So now I want data like this SQL table.
Id | Name | Sales | Transaction | Example Values | Act_Ind | Valid_From | Valid_To |
1 | Unit | True | 6 | 58001 | Y | 19000101T000000 | 99991231T235959 |
2 | Key | True | 6 | 121216 | N | 19000101T000000 | 20200403T000000 |
2 | Key | True | 8 | 121216 | Y | 20200404T000000 | 99991231T235959 |
3 | Value | True | 18 | 820595,2 | Y | 19000101T000000 | 99991231T235959 |
That's normal SCD2 loading with - based on your sample data - business key column ID.
Do you get delta records or full volume records where you have to work out if something changed?
If delta records: Do you get updates and new inserts together or separately? And if deltas: Do you also get deletes?
In the change records: Do you really get below highlighted columns already from source?
For expired records without a new record (a "delete") but latest record added: What value would Act_ind have? "Y" or "N"?
I will get delta records separately and in that there won't be any fields like 'Act_Ind' and 'valid_from' and 'valid_to' . It is our duty to insert/update the changes to the main master file and change the 'Act_Ind' and 'valid_from' and 'valid_to' fields.
"I will get delta records separately"
Details matter so please be specific:
- How many delta files per load cycle do you get? - 1, 2 or 3 - if not 3 then what's combined (update, insert, delete)
- For expired records without a new record (a "delete") but latest record added: What value would Act_ind have? "Y" or "N"?
- What data volumes are you dealing with? - Both in the target table and in the delta files
- Are you using DI Studio?
What are the keys for the table?
So when you get an update does it say whether it is Insert, Delete, or Update? Or do you have to figure that out?
Once you figure it out then decide what to do in each of the situations.
For INSERT you should be able to just add a new observation. Make sure to set the from/to timestamps.
For DELETE you should just update the existing record changing the status and to timestamp.
For UPDATE you need to convert it to an INSERT and DELETE. See above.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.