SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

SCD Type 2 - DI - Beginning Date

Reply
Contributor
Posts: 58

SCD Type 2 - DI - Beginning Date

I'm using the SCD Type 2 Loader transformation in DI Studio.

I need to generate the beginning date '01JAN1900:00:00:00'DT in the first insertion of the natural key instead of using the function %SYSFUNC(DATETIME()).

In the second insert onwards, I want to use the function %SYSFUNC(DATETIME()).

Can someone help me?

Super User
Posts: 5,424

Re: SCD Type 2 - DI - Beginning Date

Posted in reply to DavidCaliman

I don't think that you have that functionality out of the box in SCD Type 2 Loader.

One way to do it is to first do some kind of look-up to the target table to identify is this is a new key or not. If so, set your initial value in the beginning data column. Then have that column mapped in the SCD Type 2 Loader (if the columns is mapped, it will that one prior to the specifications on the change tracking tab).

Quite a job, especially if you wish to do it across all tabels in your DW.

Could be a feature to ask SAS to implement in future releases.

Data never sleeps
Contributor
Posts: 58

Re: SCD Type 2 - DI - Beginning Date

There is a field of expression to the beginning date.

Perhaps it is possible to develop an expression there with the logic that I want.

Super User
Posts: 5,424

Re: SCD Type 2 - DI - Beginning Date

Posted in reply to DavidCaliman

If I understand your requirement right - I doubt that you can. You need to know whether this is the first record for this key, and this information is not available in the mapping step.

What you also can do is to run the code in debug mode - which will leave the temporary tables undeleted, such as etls_newrecs (check the name - I don't have the SCD code at my fingertips). Pick the keys, set your beginning date, and apply a second update to the target table. This could be done in the transformation post code.

Data never sleeps
Respected Advisor
Posts: 4,173

Re: SCD Type 2 - DI - Beginning Date

Posted in reply to DavidCaliman

I don't believe such functionality is available as part of the SCD2 loader. has a nice idea with using "etls_newrecs". I believe though that this table contains all records to be appended. If so then you would need to compare first with the "etls_updtrecs" (check the names) table and only select the keys not in this table to get the "first time" records.

I'm not really sure that it's the right thing to set the beginning date of the "first time" record to 1900. You could add the version number which would then allow you in queries to use row level selection (where version_nr=1) instead of a "group by ... having" clause.

What I personally prefer to do is to set the beginning dates to the datetime of the source data - information you might get as part of the source filename or some control table. Such an approach allows to have aligned begin dates in multiple SCD2 tables as well as having consistent values in case of rollback/reload or catch-up runs.

SAS Employee
Posts: 13

Re: SCD Type 2 - DI - Beginning Date

Posted in reply to DavidCaliman

Another option is to use the Table Comparision Transform to produce two data sets of New and Update records then handle the new records with one path - just append them, then the update records pass them into your type 2 scd transform...

Respected Advisor
Posts: 4,173

Re: SCD Type 2 - DI - Beginning Date

Posted in reply to Michael_SAS

If dealing with larger volumes how will such an approach perform compared to the SCD2 loader with a permanent XREF table?

SAS Employee
Posts: 13

Re: SCD Type 2 - DI - Beginning Date

I actually think it should be as fast if not faster because the compare table can also use a perm XREF.

Super User
Posts: 5,424

Re: SCD Type 2 - DI - Beginning Date

Posted in reply to Michael_SAS

Great discussion!

Allow me to sum up a bit..?

, you're absolutely right, you need to combine with etls_updtrecs, which requires some (minor) additional programming. I believe that you could make this in a macro, and avoid hard coded code. I also believe this would be the most effective solution (especially if you don't have a XREF for other reasons) in terms of performance, the look-ups have already been done by previous steps in the Loader.

Doing a pre step by comparing with the target table, or an existing permanent XREF table would be best practice, since it could be done by using standard transformations, and there would be no hidden code in post code or macros.

Data never sleeps
Ask a Question
Discussion stats
  • 8 replies
  • 1096 views
  • 0 likes
  • 4 in conversation