snoopy369
Barite | Level 11

Hi, I have a Data Integration Studio project that we use to do ETL for a survey that is fielded on a regular basis.  Each round, we will have a nearly identical DIS job, but not *completely* identical, extracting data from one place and putting it in another.  (It will be identical in how it works, but there may be updates to the logic or the objects each round, and the physical location of the source and destination libraries varies per round.)

 

The way we plan to implement this is to have a Metadata folder for each round, with the jobs and the source and target libraries defined in it.  To deploy a new round, we will copy the previous round's folder to a new folder named for the new round. The copied libraries are automatically renamed to something like "Library Name (1)", so we will rename them in Management Console to something that indicates which round they belong to, and then move forward with any updates to the programs or metadata.  DIS/Management Console seems to handle the metadata objects well there.

 

The question here is twofold: one, is this a good idea (or the best practice); and two, will there be any difficulty with, for example, our libraries having the same SAS libname but pointing to different folders? 

 

For a sense of scope, we have about 80 jobs in this project, each of which has quite a few steps, sometimes including user written code (we avoided it when we could, but it's not always avoidable).  

 

We would love to find a solution that doesn't require multiple copies of the code, but we don't see a way to do that - we need to have multiple rounds active at once (both because we could have two rounds *in field* at once, and because we typically continue to refine the ETL after a round has finished fielding, as we find new errors in the survey implementation or the ETL).

 

Thanks!

3 REPLIES
Patrick
Opal | Level 21

A libref must be unique within a metadata repository. SAS won't let you create duplicates.

Normally ETL jobs should remain static and any change should go through a full SDLC, so creating copies per round doesn't feel like an optimal approach to me.

 

What you could attempt is to create data-driven processes which a "static" ETL process then executes. For example, if you've got a set of rules, store these rules as data and create an ETL job which consumes this data and generates and executes code at runtime. The same applies to libraries: you can keep the path names in a parameter file which you use to generate macro variables, and then create metadata library definitions which use these macro variables as the path.
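A minimal sketch of the parameter-file idea in plain SAS, for illustration only. The table `work.round_params`, the librefs `srvysrc` and `srvytgt`, the macro variable names and all paths are hypothetical; in practice the parameters would come from an external file or table rather than inline datalines, and the libraries would be defined in metadata with `&src_path` / `&tgt_path` as their path.

```sas
/* Hypothetical parameter data: one row per survey round.        */
/* Round names and paths are placeholders, not real locations.   */
data work.round_params;
  infile datalines dsd truncover;
  input round :$8. src_path :$200. tgt_path :$200.;
  datalines;
R01,/data/survey/r01/source,/data/survey/r01/target
R02,/data/survey/r02/source,/data/survey/r02/target
;
run;

/* The round to process; assumed to be supplied by the caller. */
%let round = R02;

/* Turn the matching row into global macro variables. */
data _null_;
  set work.round_params;
  where round = "&round";
  call symputx('src_path', src_path, 'G');
  call symputx('tgt_path', tgt_path, 'G');
run;

/* The same librefs resolve to round-specific folders at runtime. */
libname srvysrc "&src_path";
libname srvytgt "&tgt_path";
```

With this layout the job code only ever refers to `srvysrc` and `srvytgt`; switching rounds means changing `&round`, not copying the job.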

snoopy369
Barite | Level 11
Well, SAS happily seems to create a duplicate for me when I copy the folder / import the SPK - different _name_ but same _libref_...

Honestly, I'd far prefer to do things as you say - data driven - but DIS doesn't seem to encourage that. If I'm going to do that, no reason to use DIS at all really from what I can see...
Patrick
Opal | Level 21

@snoopy369 wrote:
Well, SAS happily seems to create a duplicate for me when I copy the folder / import the SPK - different _name_ but same _libref_...

Honestly, I'd far prefer to do things as you say - data driven - but DIS doesn't seem to encourage that. If I'm going to do that, no reason to use DIS at all really from what I can see...

You're right. Just tried and I could create two metadata libraries with the same libref. This will certainly cause trouble if SAS tries to assign both of these libraries (like the 2nd one overwriting the first one).
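The overwriting can be reproduced in a plain SAS session, outside of metadata. This is only a sketch; the libref `survey` and both paths are made up for the example.

```sas
/* Two assignments of the same libref within one session. */
libname survey "/data/survey/r01";  /* hypothetical round 1 folder */
libname survey "/data/survey/r02";  /* same libref: replaces the assignment above */

/* Writes the path the libref currently resolves to into the log. */
%put %sysfunc(pathname(survey));
```

Any step that runs after the second statement and refers to `survey.<table>` uses the second folder, regardless of which round the job was meant for.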

 

As for using DIS or not:

I guess that depends on the bigger picture. Don't forget that you also can implement re-usable custom transformations. 

I actually built such a data-driven process this year. There is an "engine" which uses control data to execute pattern jobs, so it's fully data driven. I implemented this engine as a custom transformation, with code that was about 95% copied from the DIS Loop transformation.
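To illustrate the general shape of such an engine (this is not the actual transformation code, and not what the DIS Loop transformation generates), here is a small sketch in plain SAS. The macro `%pattern_job`, the table `work.control` and its columns are all hypothetical.

```sas
/* Stand-in for a real pattern job; it only writes a message to the log. */
%macro pattern_job(round=, src_path=, tgt_path=);
  %put NOTE: Running round &round from &src_path to &tgt_path;
%mend pattern_job;

/* Hypothetical control data: one row per round, with a flag    */
/* that says whether the round should currently be processed.   */
data work.control;
  infile datalines dsd truncover;
  input round :$8. src_path :$200. tgt_path :$200. active;
  datalines;
R01,/data/survey/r01/source,/data/survey/r01/target,1
R02,/data/survey/r02/source,/data/survey/r02/target,1
R03,/data/survey/r03/source,/data/survey/r03/target,0
;
run;

/* The "engine": queue one macro call per active control row.   */
/* %nrstr delays macro execution until this DATA step has ended. */
data _null_;
  set work.control;
  where active = 1;
  call execute(cats('%nrstr(%pattern_job)(round=', round,
                    ', src_path=', src_path,
                    ', tgt_path=', tgt_path, ')'));
run;
```

Adding a round then means adding a row to the control data; the pattern job itself stays a single copy.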

 
