CLUSTERID NAME SURNAME
121 Amit Sharma
122 Nikhil Devang
122 Nikhil Devang
Suppose If I am running a SAS DI SCD2 type transformation and the business key is Cluster Id and if I am generating surrogate key which is clusterno, then in the final table I am getting the values in the Active_FLAg for Nikhil Devang only in the last record of that id 122 as Y which I defined in the current indicator as Y
24JUL2019:14:42:11.074530 24JUL2019:14:42:11.074530 122 Nikhil N 10 Devang
31DEC2099:00:00:00.000000 24JUL2019:14:42:12.074530 122 Nikhil Y 10 Devang
What if I want Y in both the rows. Also if there is cluster_id in the final table. And if I insert two new entry with cluster id 122, then I would want Y for both the newly inserted cluster. How is it possible?
I want the below results
31DEC2099:00:00:00.000000 24JUL2019:14:42:12.074530 122 Nikhil Y 10 Devang
31DEC2099:00:00:00.000000 24JUL2019:14:42:12.074530 122 Nikhil Y 10 Devang
Only for last value in the active flag
Ideally post DIS questions into the data management forum to get more people with the right skills looking into your question.
What you're asking for is not possible - and rightly so.
An SCD2 table may not have more than one current record per business key. Also overlapping start/end dates for the same business key are not allowed and would create a mess if it ever happened. That's not a SAS "thing" but how type 2 tables work.
A business key is a primary key and should be unique in your source table (per load). You should implement logic for "de-duping" prior to the load into the type 2 target table.
The SAS SCD2 transformation deals with duplicates the way you observe it - just by expiring duplicates "immediately" while loading - but this just increases the volumes in your type 2 table without adding value.
....and I guess now it's also clear why there can only be a single change current indicator (active_flag) set to "Y" per business key.
Ideally post DIS questions into the data management forum to get more people with the right skills looking into your question.
What you're asking for is not possible - and rightly so.
An SCD2 table may not have more than one current record per business key. Also overlapping start/end dates for the same business key are not allowed and would create a mess if it ever happened. That's not a SAS "thing" but how type 2 tables work.
A business key is a primary key and should be unique in your source table (per load). You should implement logic for "de-duping" prior to the load into the type 2 target table.
The SAS SCD2 transformation deals with duplicates the way you observe it - just by expiring duplicates "immediately" while loading - but this just increases the volumes in your type 2 table without adding value.
....and I guess now it's also clear why there can only be a single change current indicator (active_flag) set to "Y" per business key.
If you load multiple records with the same business key in a single batch then you don't really have control over the load order within the tuple and which record will in the end become the active one. I'm no more sure if the SCD2 generated code maintains the sort order of the source table - but even if it does it's certainly not documented nor guaranteed behaviour.
I believe that if you want to load multiple changes ("duplicates") in a single batch then you would need to implement the SCD2 loader within a loop transformation and feed one "duplicate" per iteration in the sequence of the changes (with the loop executing in sequence and not in parallel).
"If I recall right SCDType2 doesn't do MD5() comparison between input rows"
You recall right.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.