Hercules
Fluorite | Level 6
Hello,

Sorry if the post is irrelevant to the forum but I really need some guidance.
I've written a SAS macro for dynamic Data Warehouse implementation which has reduced the time and implementation cost in my project, and I received an award for it, so it is tested and live.
I was thinking of writing a paper describing the process and sharing the source code so that anyone can use it in their own projects.
I have posted it on sascommunity.org as an article before submitting it for publication, as this is my first time formatting a technical paper. I would be grateful if you could give your honest feedback and shed some light on my weak parts.
Find the link below

http://www.sascommunity.org/wiki/%25UniCreate:_A_SAS_Macro_for_Enterprise_Data_Warehouse_Dynamic_Imp...
AbhiD
Obsidian | Level 7

Hi Anant, I read your paper on dynamic creation of extraction queries for sourcing data. As I understand it, the macro will help read tables from a source when the necessary information is provided, and all of this is handled dynamically.

 

My questions:

1 - Is it possible to filter out records when reading from the source?

2 - Is it possible to handle joins in the macro itself?

 

Rgds,

Abhi

 

Hercules
Fluorite | Level 6

My apologies for the late reply, and thanks for reading the paper.

 

1 - Is it possible to filter out records when reading from the source?

Ans: Yes. You can add one more column to the driving table, named FLTR_COLMN, at the table level.

Then select that column along with the table name into a macro variable and paste it into the PROC SQL INSERT INTO clause, as sketched below.
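To make that concrete, here is a minimal sketch of the idea, assuming a hypothetical driving table META.TAB_LIST whose FLTR_COLMN column stores a WHERE condition per source table (all table, library, and column names here are invented for illustration, not taken from the paper):

proc sql noprint;
    /* Pick up the table name and its filter condition from the
       hypothetical driving table. */
    select tbl_nm, fltr_colmn
        into :tbl_nm trimmed, :fltr trimmed
        from meta.tab_list
        where tbl_nm = 'CUSTOMER';
quit;

/* The generated extraction query then appends the stored filter. */
proc sql;
    create table land.&tbl_nm as
        select *
        from src.&tbl_nm
        where &fltr;   /* e.g. FLTR_COLMN holds: load_dt > '01JAN2020'd */
quit;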

 

2 - Is it possible to handle joins in the macro itself?

Ans: For handling joins you have to make a new table with the columns SRC_TAB, TGT_TAB, JOIN_TYP (in which you specify inner, left, or right join) and CONDTN. Store it in a macro and call it; a sketch follows.
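A hedged sketch of how such a control table could drive the join, using the column names from the reply above; the macro, library, and table names are hypothetical:

%macro gen_join(row=1);
    /* Read one row of the hypothetical control table META.JOIN_CTRL. */
    proc sql noprint;
        select src_tab, tgt_tab, join_typ, condtn
            into :src trimmed, :tgt trimmed, :jtyp trimmed, :cond trimmed
            from meta.join_ctrl(firstobs=&row obs=&row);
    quit;

    /* Generate and run the join; duplicate column names between the
       two tables would still need explicit handling. */
    proc sql;
        create table land.joined as
            select a.*, b.*
            from src.&src as a
            &jtyp join src.&tgt as b   /* JOIN_TYP: INNER, LEFT, RIGHT */
                on &cond;              /* CONDTN: e.g. a.id = b.id     */
    quit;
%mend gen_join;

%gen_join(row=1)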

Patrick
Opal | Level 21

Hi Anant,

 

I've just read your paper - a bit fast, so some of my comments may come from me misunderstanding things because I have not spent enough time with your approach.

 

Just a few thoughts:

monotonic() is an undocumented and unsupported function. There is a SAS Note about it. Therefore monotonic() should not be used in a production environment.
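As a documented alternative to monotonic(), a DATA step row counter via the automatic _N_ variable gives the same row numbering and is safe in production; the table names below are placeholders:

data work.with_rownum;
    set src.customer;   /* any input table */
    row_num = _n_;      /* _N_ is documented, unlike monotonic() */
run;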

 

If I understand correctly, your code is mainly creating tables and copying data from source to target without any transformations. It does a lot of data conversion, but isn't this what the SAS/Access engines already do? If so, what is the purpose of your code?
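For comparison, this is roughly what a plain SAS/Access copy looks like (the connection options are invented for the example); the engines take care of the type mapping themselves:

libname src oracle   user=scott password=XXXX path=orcl   schema=srcschem;
libname tgt teradata user=dbc   password=XXXX server=tdsrv database=land;

data tgt.customer;
    set src.customer;   /* the engines map Oracle types to Teradata types */
run;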

 

In my experience, when it comes to serious implementations, you don't let SAS create the target tables in a database. You use explicit, database-specific DDL, possibly generated from a design tool like erwin. There are things like constraints, referential integrity, and table partitioning which you simply can't derive dynamically from the source metadata.

So normally you create the target structures explicitly, and only then do you implement the DI processes that load from source to target.
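In SAS terms, that usually means submitting the database-specific DDL through SQL pass-through; the CREATE TABLE text below is an invented example of what a design tool might generate:

proc sql;
    connect to teradata (user=dbc password=XXXX server=tdsrv);
    execute (
        create table land.customer (
            cust_id  integer not null,
            cust_nm  varchar(100),
            load_dt  date format 'YYYY-MM-DD'
        )
        primary index (cust_id)
    ) by teradata;
    disconnect from teradata;
quit;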

With SAS/Access a lot of the data conversion can be handed over to SAS - else you will need some "hand coding".

 

In my experience, the main design work is data modelling and getting the data mapping right, and the main implementation work is in data validation, cleansing, and the transformation step; and, with bigger volumes, in implementing things so that they perform. I can't see that your code solves any of that.

 

Considering all of the above - and I might not do you justice as I haven't spent enough time looking into the details of your code - I don't really understand what problem you're solving here and I have my doubts that what you're proposing is a time saver.

Hercules
Fluorite | Level 6

Thanks, Patrick, for spending your valuable time reading my paper. Being a data modeler, I completely agree with your point that DDL should be created from erwin Data Modeler. But the need of the hour is to automate a system which can save time and money. What I am actually doing here is not playing around with any transformation, but migrating the data from source to a landing area, from where we can clean and transform the data before moving it into the staging area. I am focusing more on creating the DDL for the tables.

 

Now, for every table, the developer has to execute a query to identify the DDL at the source. If I have 38 sources with 1,000 tables in each source, then the developer has to manually fire the query for a total of 38 x 1,000 tables, copy each DDL, change it to suit the Teradata (target) database, and then execute the queries for all the tables. Likewise, he or she has to do this for the rest of the schemas, i.e. Landing, Staging, Error, Archive, Incremental, etc.
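To give readers a feel for the idea (this is only a simplified sketch, not the %UniCreate code itself), source column metadata can be turned into DDL text automatically; the libref, table name, and type mapping below are deliberately minimal assumptions:

proc sql noprint;
    /* Build a column list from the source table's metadata.  A real
       Teradata conversion needs many more type cases than this. */
    select catx(' ', name,
                case when type = 'char' then cats('VARCHAR(', length, ')')
                     else 'FLOAT'
                end)
        into :col_list separated by ', '
        from dictionary.columns
        where libname = 'SRC' and memname = 'CUSTOMER';
quit;

%put CREATE TABLE land.customer (&col_list);   /* generated DDL text */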

 

As you can see, this is time consuming, so I proposed a way to remove the manual intervention of developers and make this an automated system. This is what I have implemented in my project, and it has saved us a lot of time and implementation cost.

 

I appreciate your feedback. Please let me know if I have not justified my answer, and I apologize for my late reply.

 

Thanks.
