BookmarkSubscribeRSS Feed
All,



If you need to maintain a surrogate key in DI Studio, you have several options including the Type 2 SCD Loader or the Surrogate Key Generator transform. However, in some cases you do not need all the functionality attached to the Type 2 SCD Loader and you may not be able to tolerate another pass against your data to use the Surrogate Key Generator transformation. In this case, the method below may be of interest.



If you're already using the Table Loader transformation in 4.2.x, you may use the following steps to create surrogate key values without another pass against the data:


[pre]
1. Create code like the following in the"Precode" setting for the Table Loader:
PROC SQL noprint;
select max()
into :highValue
from .;
QUIT;

2. Map the expression below to the "Expression" setting for the surrogate key field in
the Mappings tab for the Table Loader:
&highValue + monotonic()
[/pre]



This works as follows:

  1. The PROC SQL query selects the current max value of the surrogate key field in the table into a macro variable (&highValue in this case).
  2. The expression above creates a unique surrogate key value during each iteration of the Table Loader read loop. This works because the monotonic() function returns an ascending value (starting with 1) each time it is called. So, adding this to the current maximum for the key in the table creates unique, ascending values.



    Note that the library for the target table above may need to be pre-assigned, since DI Studio will not have generated the libname statement for the Table Loader target table when the Precode is executed.



    Screen shots showing the two steps above are attached. Please let me know if you have any questions.



    Thanks,



    Tim Stearn


    SAS Product Management
5 REPLIES 5
LinusH
Tourmaline | Level 20
I'm not sure if this is a great idea.
-The monotonic function is not supported by SAS
-The described pre-step does a full table scan (a pass of data) if the table resides in a SAS Base libname. Indexing the surrogate key column won't help. If your data is in SPDS, this would work excellent and extremely fast, since SPDS will query the meta data portion of any existing index. If your data is an external RDBMS, it's likely that you have functionality of generating surrogate keys built in.

An alternative (if you have your target table sorted by the surrogate key) is to use data set functions to retrieve the last record and its surrogate key value. This is also very efficient. And if you don't wish to use an unsupported function, you could just use a data step view and calculate the key values using _n_ and the current max value retrieved earlier.

/Linus
Data never sleeps
All,

I had been waiting to reply to the question posted in this thread about whether it was safe to use the monotonic() function in the manner described in this article. As is sometimes the case with SAS Notes, it has proved challenging the nail down the exact circumstances in which monotonic() doesn't behave as expected when used in conjunction with PROC SQL.

Suffice it to say, however, that the approach described in the article was successfully tested prior to the post. Furthermore, I've been using a similar method for 3 years to generate serial numbers during the extract process to ensure that all extracted records will be uniquely identified. This I've done across hundreds of jobs with no unexpected results.

So, I feel confident in saying the method I've described is safe within the context of the article. You should not, however, use monotonic() in complex query situations (like subqueries) where you're depending on some kind of underlying record order in the SQL result set. Since this not the case for surrogate key assignment, this approach is valid.

Thanks,

Tim Stearn
Patrick
Opal | Level 21
Hi Tim

I strongly agree with Linus.

Even if monotonic() works perfectly for the case you're describing it's still an undocumented and unsupported function. I don't think its usage should be proposed for anything.

If I'm using monotonic() in an implementation and things go wrong what do you think SAS Tech Support would tell me?

I have of course no issues at all with monotonic() if SAS R&D decides to use it for automatically generated code.

But may be even better would be to offer an option to store and maintain the max key in a permanent table - as already done for xref tables in the SCD2 loader.

Cheers
Patrick

And just as a side note: It would of course be very nice to have monotonic() as a supported function because it's sometimes really useful. I admit to be guilty.
The problems with monotonic() I read about where in conjunction with Oracle queries – which was actually kind of hard to understand as Oracle has a similar function row() and I would naively expect that SAS just could translate monotonic() into row() and send it to Oracle for execution.



Message was edited by: Patrick
RMP
SAS Employee RMP
SAS Employee
An alternative without using the Precode -


create a column highValue with expression - MAX()
create the sequence column with expression - monotonic() + CALCULATED highValue.
VishantBhat
Fluorite | Level 6

Hi Tim,

Can you pass the code you have for generating this? I am not able to open the attachment and I think this would be very helpful for me.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 4557 views
  • 1 like
  • 5 in conversation