Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Accepted Solution Solved
Reply
Super Contributor
Posts: 418
Accepted Solution

Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Hello. My company only has Base sas (NO sas ETL).  Due to the power of base sas and what i've been able to do with it; along with it's flexibility, people from the company would like to determine if Base SAS can be used as an enterprise ETL solution (or an enterprise solution in general for data movement, transformation, etc..)

My Primary Question:

I am curious, do any companies use Base SAS as their primary tool for ETL related work?

My secondary question is, if you have scripts in sas that you want to run, but allow other users to run, how do companies deal with this problem?  AN example would be: I wanted to let "susan anyperson" run program ABC, which has two user inputs.

I know I could have her run a batch file with the specific inputs, however she doesn't know what a batch file is, and she would need access to the computer SAS is stalled on....  How does one get around this?  Is this the point of server SAS?

Reasons for Question:

I ask because my understanding implies base SAS has some pretty big limitations that I am not sure how other companies get around (if it is used). An example would be that base sas can only be called from the command line, so it cannot be set up to run from a sql server database on any triggers; unless you already have an SSIS package to call it. If you already have an SSIS package to call it, why not just write the entire ETL in the SSIS package?

I've never used SAS ETL, but how would SAS ETL get around the same problem?  Let's say you had a database that supposed a user applcation, and any time a new unit (loan) was added you wanted to run some sort of etl and move that data to a secondary database... This is a trigger based event, how would one call SAS (base or ETL) to do this task?

Thanks for your time!


Accepted Solutions
Solution
‎09-05-2014 10:48 AM
Respected Advisor
Posts: 3,124

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

I think maybe you are squeezing Base SAS too hard in term of cutting corner for your organization. I am not saying it can't be done, it will make your process so complex that it may not worth the effort. Given the toughening regulatory environment (such as how do you handle user credentials),  I know I can't do it upfront just for compliance and IT security reason for my organization.

You may already know the ETL tool from SAS family is called SAS DI, and it is heavily metadata driven. It has everything you asked in your question, scheduling, user management, security,  relational database etc.

FYI for task scheduling:

Four ways to schedule SAS tasks - SAS Users Groups

Regards,

Haikuo

View solution in original post


All Replies
Valued Guide
Posts: 3,206

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Interesting questions anotherdream as this is the world of dreaming in IT-strategy (the future not yes known) as opposed the the  tactical ones (what we can do know).

SAS has been around for 30-years now. It has always been used also as ETL-tool (delivering data) running SAS-code. I am convinced this a the major usage if we would get some insight on the processing. It is about getting the data (Extract Transform Load).
When this is done by the operational IT, it  is scheduled processed on a server. User input at a dedicated location. You can see those operational IT as those guys that did put in the punch-card  as operator task in the 60's.  They got newer and other reincarnations as a RDBMS-DBA (restricted) or SSIS-DWH builder or -.

The easy recognition is: a/ your are not allowed to do anything you should ask / very humble/  them  b/ Their tool can do anything all other tools are rubbish c/ cooperation communication are difficult words.    (:smileydevil: kidding but recognizable:smileygrinSmiley Happy

You are right that is the point of a server SAS approach.      Instead of just batch, old IT, there are more ways like a SP (Stored Process) web-pages office plugins. 

Now going back for SAS-DI and ETL.
With SAS-DI all is about metadata the metadata is the description of the data. You could see SAS datasets labels/variables/types as a nice start, you know them.
Now think to put all that information in a database. Ok .... That's is a metadata database .  No data but the descriptions of all data.
Having that you can define how data /records/fields relations between records and fields has to be processed (some coding as transformation) in a schematic way. this is just designing and there are a lot of tools doing this.

Why would you use SAS-DI? It has this metadata and and it has the code generation (SAS code) generated from that schematic presentation. (all in a nutshell).

You are mentioning SSIS that's is MS, nice to see is:  SSIS Tutorial: Creating a Simple ETL Package  The concept Integration Services (SSIS) Packages what SSIS is calling a package is SAS calling a Job in DI. The package with SAS is used for a collection of jobs in a deployment approach. SSIS is very only Microsoft (lock-in).
That release management approach of develop test acceptance production is a common business requirement but not very clear with toolings.  

There was some question of you whether it would be possible to call SAS within SSIS. I know SAS can be called from VBA java using the SAS server environment. Eguide is an example of a MS .Net application. I think the answer for that should be yes it is possible.  Would you like to do that? For that I would only when alternatives are more complicating. Having as less dependicies as possible makes more sense.  



       

---->-- ja karman --<-----
Respected Advisor
Posts: 3,825

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

On a general level my answer would be: Yes, you can do all these things with SAS. From what I get you're currently having Base SAS with a PC licence. So there I would say: What you're having in mind here needs a server whether it's done with SAS or any other software.

If your company is really serious about such an extension of its SAS usage and also would consider to invest in additional hardware and software (and people as someone needs to actually implement and maintain all this new stuff) then I would contact your local SAS office with a request for information.

Super Contributor
Posts: 418

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Hello Pat and Jaap! Thanks for your responses.  I guess you did answer my first major question (people do use SAS for ETL work), however I'm still stuck a bit on how a server implemenation would help my other issues.

Still confused on:

Example: IF susan any girl needed to be able to run a job at any time of the day on the server, and she's not a sas user, how would she do this?

Is it the fundamental idea that this CANNOT happen and that's what your sas programmers are there for?  I guess i'm confused because that defeats the purpose of having closed looped programs that no longer need a programmer, but just a 'run person'.  I guess I don't see how you would pass this off to the business, even if SAS is on a server...

Does SAS on a server come with some kind of web interface that allows people to run programs?  If so that would explain a lot of my confusion.

Also curious about:

Say we have an application that writes to a sql server database (or any database, oracle, etc..) and we need an etl tool to immediately copy any data written to the tables to a secondary reporting table.  How would sas do this, since it cannot be triggered from the writing to the database?  The only way I know how to trigger SAS currently is through the task scheduler... Perhaps this is also solved by new functionality from a SAS server?

Thanks again!

Solution
‎09-05-2014 10:48 AM
Respected Advisor
Posts: 3,124

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

I think maybe you are squeezing Base SAS too hard in term of cutting corner for your organization. I am not saying it can't be done, it will make your process so complex that it may not worth the effort. Given the toughening regulatory environment (such as how do you handle user credentials),  I know I can't do it upfront just for compliance and IT security reason for my organization.

You may already know the ETL tool from SAS family is called SAS DI, and it is heavily metadata driven. It has everything you asked in your question, scheduling, user management, security,  relational database etc.

FYI for task scheduling:

Four ways to schedule SAS tasks - SAS Users Groups

Regards,

Haikuo

Respected Advisor
Posts: 3,825

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Example: IF susan any girl needed to be able to run a job at any time of the day on the server, and she's not a sas user, how would she do this?

There are multiple ways of doing this depending on which SAS modules you're having licensed and also what Susan actually needs to do. Only trigger a batch job or something like refreshing data in an Excel (where AMO, the SAS Add-In for Microsoft Office, is a tool which empowers business users). AMO requires a licence though. If it's just about triggering a batch job: You could for example having your SAS job under a scheduler which has a file watcher set up for a directory where Susan can create a file. Susan then creates a token file (just an empty file with a defined name) which then triggers the SAS job under the scheduler to execute.  


Say we have an application that writes to a sql server database (or any database, oracle, etc..) and we need an etl tool to immediately copy any data written to the tables to a secondary reporting table.

That really depends on the concrete scenario. Data replication is something databases are good at so if this is something to be done on the same database server then I'd probably would implement within the database without using SAS. If it's about replicating to another database then I would first investigate if there is already a ready-made interface available for such data exchange between the 2 databases. You can of course also call a SAS process which then moves this data.

It also depends how data gets changed within a database. If it's a scheduled process then the scheduler could call the SAS ETL process after the load process into the database. There is also CDC (change data capture) which you can use with SAS.

So in short: There is always a way of doing things with SAS. If it's the right way for your organisation and environment will depend on many factors.

Super Contributor
Posts: 418

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Hello Patrick.  When you note that you "have your sas job under a scheduler which has a file wateher set up for a directory"... Do you mean a windows scheduler or a SAS scheduler, or a custom built scheduler?

I only have access to a windows scheduler, and I am not aware of that functionality, however if it exists that would be an extremely elegant solution to a problem I have.

Respected Advisor
Posts: 3,825

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Hi

Commercial schedulers like Control-M or LSF have such functionality so that you can trigger a process on an event like "file arrival". I couldn't find something similar for the Windows Task Scheduler.

What the Windows Task Scheduler can do is trigger a process based on an event logged in one of the event logs. I'm not sure if it is possible to log "file creation" in a specific folder as an event. The security log would allow you to monitor a folder for activity by certain users or groups - but that's not really what you're after. May be there is a way and some more Internet research could reveal it.

What you can do: Write a script (eg. a .bat) which checks a folder for the existence of a token file and if found this scripts then starts the process you want (and removes the token file). Schedule the script to be run in regular intervals (let's say every 15 minutes). That should give you something close to a file watcher and it's not hard to implement.

Valued Guide
Posts: 3,206

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Anotherdream,
For the solving of the confusion. The Server environment that can come with web-interfaces (SAS-portal SAS-va), office clients AMO and Eguide usage (even more). I don not like to call MS-office as web based the AMO should be desktop based as plugin.

Within the BI/DI server environment that is coming with that you have Eguide and the SAS-Foundation coming in. All you existing code should be able to run as long you did not hard-coded physical names. So far nothing real new thing for you. 

Let me go into some roles .
Please let me some IT guy for you (some of those devils). You are the analist/developer. For me that is almost the same. That is a real problem for me as most people are not seeing in the same way. You have some analysis that Susan should be able to run or use (not update) .

We will setup an environment you can do your work of building some code (EGuide). Then there are some new features, SP Stored processes and a prompting framework. The SP is your SAS code that need to get a registration in the SAS metadata,  with that registration you can connect those prompts to it. Promtps are defined as pop-ups at runtime that should get some value. Not really very complication for your SAS program as it sets as SAS macro variable for your program. You already know what sas macro variables are.

Works this well verified/test, we can handover your program to be used as production and available for susan. It is making just some copies.
What we have chosen as tool (one of the several mentioned) she can used to run that.

Your curiousity:
Suppose you have a lot of databases with sources. Using ETL is also thinking on the DWH design. It can be as simple as a "data lake" where all data is dumped to a complexity of fully build aggregated cubes (OLAP). In this process a scheduler / trigger / system on the server is major key component.  SAS is mostly promoting LSF also part of grid computing for this. It is called sometimes different but it is LSF. By that you have time events but also some other triggers as a file-event Sending files to a dedicated location can be recognized. There are many ways to solve the operational process..

---->-- ja karman --<-----
Super Contributor
Posts: 418

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

Ah, so a lot of the ability I need is simply in a different product inherently form Base SAS, and that's how most companies use it...

Thank you Hai.kuo that is the best and most consise answer I could have hoped for... I had strongly suspected that this was the case however I kinda needed some 'experts' to fall back on (I wasn't sure if it was because of my lack of knowledge, or because base sas just isn't supposed to do what I'm trying to get it to do!).

Thanks again

Valued Guide
Posts: 3,206

Re: Do companies use Base SAS as an ETL tool, or an enterprise data solution tool?

anothedream I mentioned that as it normal functionality of a professional scheduler. SAS scheduling is based in the LSF scheduler (platform computing, bought by IBM)

---->-- ja karman --<-----
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 1340 views
  • 1 like
  • 4 in conversation