Hi Team,
I have an input dataset in the format as below :
ItemID | Itemname | itemtype | typename | itemgroup | groupname |
---|---|---|---|---|---|
1 | ABC | XXX | XXXXXXXX | A123 | XYZ |
2 | DEF | XXX | XXXXXXXX | A123 | XYZ |
3 | GHI | XXX | XXXXXXXX | B456 | ZZZ |
I need to know the process where I need to use the SCD type 2 retained key for my primary key which is Item id and surrogate keys for type and group accordingly.How should I process with in a single job.I'm new to DI and not able to get the exact flow here.
Also,SCD type 2 sometimes gives an error saying "The variable in the DROP, KEEP, or RENAME list has never been referenced".Please help me.
Expected output is
itemrk | itemid | itemname | typerk | type | typename | grouprk | group | groupname |
---|---|---|---|---|---|---|---|---|
1 | 1 | abc | 1 | xxx | xxxxxxxx | 1 | a123 | xyz |
2 | 2 | def | 1 | xxx | xxxxxxxx | 1 | a123 | xyz |
3 | 3 | ghi | 1 | xxx | xxxxxxxx | 2 | b456 | zzz |
"The variable in the DROP, KEEP, or RENAME list has never been referenced": you give us nothing to work on here - what mapping and options have you set in the transform, are your target table and meta data registration equal?
About your expected (or rather desired?). SCD Type 2 Loader is able to generate one surrogate key per target table. It looks like you are trying do design a snow flake here. As a general advice, don't use snow flakes. If you still wish to have RK for your grouping levels - why do you have the name columns in the same table?
So still, to create RK for each grouping, have each group stored in separate dimensions (snow flakes), and then do a look up to populate the Item dimension.
Hi LinusH,
My bad!I forgot to mention that I have two more columns "valid from date" and "valid to date" in my target structure.I was using scd type loader and selected these 2 dates for change data capture along with the one retained key as you said,trying to map the other columns.So you mean to say that I need to have different tables for each of them and the get the keys from them.Will try it out.Could you please highlight on that look up process as I don't have much idea on how it works.
Regards,
Maddy
Take a look at the Look-up transform.
Still, I can't see what practical use you (or your users) will have of the grouping RK columns.
Before you start designing & implementing DI jobs you should get your target data model right. So how is it in your case?
- 1 item belongs to exactly 1 type, a type has many items
- 1 type belongs to exactly 1 group, a group has many types
In such a case you could create a dimension table an use ItemID as business key (and using SCD2 there would then be a single ItemRK). No need to try and create additional RK's.
You could also decide to split up things and create 3 tables: A Item table, a Type_Ref table and a Group_Ref table. You still could load these 3 tables using SCD2 (and then you get 3 RK's) In such a case you would first load the reference tables (Type_Ref, Group_Ref) and then look-up the keys before loading Type.
...or if "1 type belongs to exactly 1 group, a group has many types" then you could also create a single reference table with typeID as business key when loading using SCD2
Patrick,
In my case the hierarchy is something as below :
Itemtype (highest)
followed by group
followed by item (lowest)
So,there can be many groups under one type and
many items under one group.Here item id is my business key which is unique but in char format.So i want to use the keys for increasing the processing speed.What I understood from above comments is that,
i have 3 tables first
1. type (rk for type)
2.group (rk for group and lokup to type)
3. item (rk for item and look up to type & group)
Is it correct?
If it's a true hierarchy then it's your design decision how you want to organize the data.
In one table it could be: {Item_RK, ItemID, groupID, itemtypeID,....}
When loading using SCD2 then Item_Id would be the business key, all the other columns would be under change tracking.
With 3 tables:
- ItemType_Ref {ItemType_RK, ItemTypeID,....}
- Group_Ref {Group_RK, GroupID,...}
- Item {Item_RK, ItemID, Group_RK, ItemType_RK....}
If you go for a 3 table approach then you would first load ItemType_Ref and Group_Ref and then look up the RK's via the ID's before loading table Item.
If you're having full loads then I would go for a 1 table approach, if it's delta then I would eventually go for a 3 table approach.
The reason for 3 tables in case of delta loads: If something like the Group_Description changes then in a 3 table approach this would simply cause a new current record and all is set, in a one table approach you would have to update all records in the table with the new description (with a full load this would happen "automatically").
As a database purist one would say that you always need to generate keys. From a more pragmatic perspective it might well be o.k. to use the ID's as provided from source and not generate RK's at all. In such a case you wouldn't need to look up the RK's but you would simply use the ID's in table "Item".
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.