BookmarkSubscribeRSS Feed
maddysbi
Calcite | Level 5

Hi Team,

               I have an input dataset in the format as below :

ItemIDItemnameitemtypetypenameitemgroupgroupname
1ABCXXXXXXXXXXXA123XYZ
2DEFXXXXXXXXXXXA123XYZ
3GHIXXXXXXXXXXXB456ZZZ

I need to know the process where I need to use the SCD type 2 retained key for my primary key which is Item id and surrogate keys for type and group accordingly.How should I process with in a single job.I'm new to DI and not able to get the exact flow here.

Also,SCD type 2 sometimes gives an error saying "The variable in the DROP, KEEP, or RENAME list has never been referenced".Please help me.

Expected output is

itemrkitemiditemnametyperktypetypenamegrouprkgroupgroupname
11abc1xxxxxxxxxxx1a123xyz
22def1xxxxxxxxxxx1a123xyz
33ghi1xxxxxxxxxxx2b456zzz
6 REPLIES 6
LinusH
Tourmaline | Level 20

"The variable in the DROP, KEEP, or RENAME list has never been referenced": you give us nothing to work on here - what mapping and options have you set in the transform, are your target table and meta data registration equal?


About your expected (or rather desired?). SCD Type 2 Loader is able to generate one surrogate key per target table. It looks like you are trying do design a snow flake here. As a general advice, don't use snow flakes. If you still wish to have RK for your grouping levels - why do you have the name columns in the same table?

So still, to create RK for each grouping, have each group stored in separate dimensions (snow flakes), and then do a look up to populate the Item dimension.

Data never sleeps
maddysbi
Calcite | Level 5

Hi LinusH,

              My bad!I forgot to mention that I have two more columns "valid from date" and "valid to date" in my target structure.I was using scd type loader and selected these 2 dates for change data capture along with the one retained key  as you said,trying to map the other columns.So you mean to say that I need to have different tables for each of them and the get the keys from them.Will try it out.Could you please highlight on that look up process as I don't have much idea on how it works.

Regards,

Maddy

LinusH
Tourmaline | Level 20

Take a look at the Look-up transform.

Still, I can't see what practical use you (or your users) will have of the grouping RK columns.

Data never sleeps
Patrick
Opal | Level 21

Before you start designing & implementing DI jobs you should get your target data model right. So how is it in your case?

- 1 item belongs to exactly 1 type, a type has many items

- 1 type belongs to exactly 1 group, a group has many types

In such a case you could create a dimension table an use ItemID as business key (and using SCD2 there would then be a single ItemRK). No need to try and create additional RK's.

You could also decide to split up things and create 3 tables: A Item table, a Type_Ref table and a Group_Ref table. You still could load these 3 tables using SCD2 (and then you get 3 RK's) In such a case you would first load the reference tables (Type_Ref, Group_Ref) and then look-up the keys before loading Type.

...or if "1 type belongs to exactly 1 group, a group has many types" then you could also create a single reference table with typeID as business key when loading using SCD2

maddysbi
Calcite | Level 5

Patrick,

            In my case the hierarchy is something as below :

Itemtype (highest)

followed by group

followed by item (lowest)

So,there can be many groups under one type and

many items under one group.Here item id is my business key which is unique but in char format.So i want to use the keys for increasing the processing speed.What I understood from above comments is that,

i have 3 tables first

1. type (rk for type)

2.group (rk for group and lokup to type)

3. item (rk for item and look up to type & group)

Is it correct?

Patrick
Opal | Level 21

If it's a true hierarchy then it's your design decision how you want to organize the data.

In one table it could be:  {Item_RK, ItemID, groupID, itemtypeID,....}

When loading using SCD2 then Item_Id would be the business key, all the other columns would be under change tracking.

With 3 tables:

- ItemType_Ref {ItemType_RK, ItemTypeID,....}

- Group_Ref {Group_RK, GroupID,...}

- Item {Item_RK, ItemID, Group_RK, ItemType_RK....}

If you go for a 3 table approach then you would first load ItemType_Ref and Group_Ref and then look up the RK's via the ID's before loading table Item.

If you're having full loads then I would go for a 1 table approach, if it's delta then I would eventually go for a 3 table approach.

The reason for 3 tables in case of delta loads: If something like the Group_Description changes then in a 3 table approach this would simply cause a new current record and all is set, in a one table approach you would have to update all records in the table with the new description (with a full load this would happen "automatically").

As a database purist one would say that you always need to generate keys. From a more pragmatic perspective it might well be o.k. to use the ID's as provided from source and not generate RK's at all. In such a case you wouldn't need to look up the RK's but you would simply use the ID's in table "Item".

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 3885 views
  • 6 likes
  • 3 in conversation