05-27-2014 05:23 AM
Hope I am in the right place posting my first question here.
Let me sketch my situation first.
In our organisation complex query’s build with SAS EG or SAS DI Studio have to be based upon a functional design when
the information is used for strategic purposes or is being used to inform external parties.
This design describes the source systems, logic and business rules used in the SAS programs to get
the desired information in the desired lay-out.
It is my new role as an information analyst to go through the specs of the information requests and
discuss them with the users eventually setting up the functional design
specifically for the SAS domain. With these documents the SAS programmers build
the program and the business users are assured the program delivers what they want.
Now the method our organisation uses to bring these documents to life is outdated and out of
place. The templates are based upon documentation for mainframe applications
and not for the SAS domain. Older functional designs that need altering are a
complete drama in terms of text, lay-out and the lack of usable diagrams.
Now I want to give this documentation method a boost. I have a nice skillset of
SAS, system/business knowledge and a software engineering background. But
before throwing it al to good use I’m searching for some best practises. A
There are plenty of methods and best practises for designing a website, a .NET or a JAVA
application. You got DFD’s, ERD’s for your data flow and data model, UML for
your classes, Mockups for the software lay-out. A lot to spice up a functional design.
But I have been to the fifth perhaps the seventh Google search results page searching for things like:
“functional design SAS”, “data flow diagram query’s”, “ Documentation complex data extracts”.
But I could not find the good stuff I was looking for. So I come to you SAS users.
Hope some of you folks got time to read this and come up with some good pointers.
If more details are needed please let me know.
Thanks in advance.
05-27-2014 05:38 AM
Well, I think SAS is used in several very different businesses. The include, but not limited to, Finance Sector, Pharmaceuticals Development, Academia, etc. I think you will find each of these areas works in very different ways, for instance Finance I would imagine have far more connection to Excel, whereas Pharma is more focused on Oracle Clinical/Medidata.
Therefore I think you need to narrow your field a bit there. As for documentation, my opinion and being in Pharma, is that we are a long way behind many other industries with things like standardization, and documentation thereof. CDISC for instance is one big area of standardization which is slowly taking hold, but in my experience the kind of metadata driven programming is just not around. I can't speak for other industries. Documentation for instance tends to be limited to Word documents, Excel sheets, still with attempts at versioning in the document titles (e.g. V1, or 12APR, etc.). None of it is linked and tends to manual.
So to conclude, all I can really point out is that there is CDISC, pharma specific, for data modeling, and you could maybe also look at Enterprise Guide which is a visual process tool, but I am not sure you would find much else from my experience.
Best of luck.
05-28-2014 08:00 AM
Thanks for the quick reply RW9!
It is indeed best to narrow it down to the business it is meant for. Which in this case is Health Insurance.
I quickly looked at the CDISC and it has some similarities with (Standardization) Organisations here in the Netherlands regarding healthcare.
But I'm not really looking for data-modeling best practises.
I'm searching for a decent way to easily visualize data being extracted from a modeled environment and make the process of turning it in to information quickly understandable to non-technical users.
You mentioned Enterprise Guide, which we use here, and that is basicly what I'm looking for.
The problem is that the EG projects in our organisation are far to complex to illustrate users how certain things are done.
Either there are too many QueryBuilder blocks or all the logic is hidden away in Program blocks. So I need to go a level above that.
That is why I believe that DFD's (Data Flow Diagrams) could sketch a better picture than EG projects or plain text Word Documents.
Only I can't find any examples of that technique or any other technique or method being used in this context.
So I started asking around before I start making changes on my own to see If I could learn a thing or two from other industries.
Many thanks again for the reply.
Hope you or anyone else can come up with some cool idea's.
Any pointers towards a different community where I can post these kind of questions/discussions are welcome too!
05-28-2014 08:57 AM
Casper it looks like the continuation of a phone call a few hours back (local NL), but your mentioned background is different.
The first thing is knowing what is SAS about.
Eguide AMO are build .NET, DI-studio Miner SMC are build in java. Al lot of other things in SAS are build in C. With SAS you are building on top of those 3-GL's. That is a long time back for 4Gl and 5Gl went out a buzz-words. Add-ons to those tools can be made using their hooks.
Eguide is a tool for analysts and developers as that is never possible to get all the details. Imagine you are asking the OR analytics or ETS time series background. That are specialistic area's. This is also the reason that within the SAS-metadata with roles is possible to give those people different pull-down menu's in Eguide.
That option you will not find with Eminer (Enterprise Miner) as that is another specialized group of users.
DI (Data Integration) studio is the tool for generating SAS code / queries in a graphical way. It goal clicking and let the meta-datamodel that is build in the SAS metadata base do the work. It is a graphical interface with forward/backward impact analyses options. It is conforming the open metadata interface so you could import/export that part of modeling.
When all is build into SP (stored Processes) you have a possible version management for developers and on top of that release management (DTAP develop test accept production) for deployment. The SP's may be run with Eguide/Amo/Portal or whatever you have available and rolled out. The main thing is having your security design and logging (OS layers) accordingly setup for all requirements. That is about the hardening part and implementing a SIEM (Security Information & Event Monitoring).
The APM package could be a help for that (9.2 - 9.3). The eventmanager (9.4) is inheriting a lot of APM.
The deployment in DTAP segregation often comes with a combined segregation in machines. The minimum for a well segregated environment are 11 metadata servers and 4 (logical) machines. This includes a technical segregation between normal operations and new technical development. A private cloud could handle that.
The approach of a private cloud will assume the users are not bound to their own physical machines. Yes they like that to have for performance reasons.
Not really new stuff the concepts and the way to solve that are well known if you have some background on how to work on mainframes.
The bad thing is a lot of SAS people are failing in this part, missing the why and backgrounds.
ISO27002 Cobit ITIL HIPAA SOX-404 Solvency SAE16/3402 all have a lot that are becoming more mandatory. Yes becoming ..... as mostly they are ignored and "continuing as we has always done". That is why you will not find a lot on your searches.
Tip: do a try google for - sas hardening - or - sas dtap - (do not get shocked)
06-02-2014 03:37 AM
Thanks for the reply Jaap,
You just sketched another dimension for my quest.
I do believe that it would be very nice to sort of attach this technical dimension into the functional world.
Giving the users insight in how and what kind of data flows into different layers of DTAP for example would improve communications and overview of the BI landscape.
However part of this technical layout falls under architecture and is therefore out of my scope (for now ).
It lies mostly with the enterprise architect and SAS admins.
I'm searching for ways to sketch the end product and blueprints of complex data extracts (eventually landing in production).
Source systems, entities, attributes, filters, joins etc. How can we visualize all what is used inside a query (In a simple way!) to give end users insight and trust that the data extract is build proper and tackles all their needs..
The DTAP layer or application it is being build in is in terms of functional documentation not required. (It is however in the technical documentation).
RW9 mentioned EG for being a Visual process tool. It would be really cool if you could forward and reverse engineer a more simple functional flow from a complex EG project. Kind of in a way you can extract UML class diagrams from JAVA or .NET code and vice versa.
It is somewhat satisfying to hear that it is not being done so I have a clear canvas to work upon and really test what the business users would expect.
On the other hand it is kind of disturbing that large organisations (Assuming people viewing this and not replying are working for fairly large companies) are not documenting this.
I wonder how they keep their businesses, and perhaps even harder, their accountants happy?
I am thinking about blogging my progress developing a small standard for (functional) documentation regarding data extracts.
This way I can keep the discussion going while making progress and provide people with at least some hits on their searches.
If anybody has some other idea's I am more than happy to hear about it.
I'll give another post when (and if) I started my Blog.
06-02-2014 04:40 AM
Casper I am seeing you are within the health-insurance area that is more about administrative visisons. I positioned RW9 into the health-research where CDISC is more prominent.
There is big difference as the world of IT to the construction building World. Engineering and architecting with constructions has a high maturity level. Within IT you have more a political personal arena to deal in. As you are still young and idealistic there are your opportunities and threats.
I mentioned the DTAP approach as for me all type of information ("records") should be handled the same way in the same technical containers.
Not matter whether it technical stuff (code) metadata or documentation. I do not understand why you would make separated storage locations for those (top view).
Yes I know in real-life the discussions in used tools and silo organizations is causing that. Architects missing that high level top view or political not wanted approach.
- you can get all information into the SAS metadatabase.
- the end users can view this kind of information using DI (impact analyses) for data-records/data-fields
you could use something like: Metacoda Custom TasksVA it is becoming possible different but also that is using the SAS metadatabase.
Having this goal you there is a big challenge as transition. Usage of Excel and very complex Eguide projects is done by lack of good IT support solving those business questions. With Excel you are not considering any reverse engineering. The things that have been build with Eguide should be reviewed in what the real issue is/was.
06-02-2014 05:39 AM
"I mentioned the DTAP approach as for me all type of information ("records") should be handled the same way in the same technical containers."
Absolutely right. That is basicly why I can't do much with the DTAP approach in my functional design. Whether the software runs in Development or Production it should in both containers work in the way I described it in my design.
"I do not understand why you would make separated storage locations for those (top view)."
The difference that I probably adressed is due to different data.
Developing and testing should (but can not always) be done with dummy data instead of production data (which makes developing a lot easier too).
You can get pretty far with just metadata as for sourcesystems, entities and attributes. And I would prefer DI to view that but the tool you mentioned is pretty cool.
The problem is your missing the business logic then. Why pick that attribute? What filters or joins do we apply and why? And you have to keep in mind that an accountant is probably not capable of finding his way in the meta-data.
That is one of the reasons why we (I) make it simple for him in a functional design.
I do believe Jaap that we have a different vision since I am having some difficulties linking your answers to my main question. (Which can offcourse be related to me being new to BI and fairly young (you guessed huh? ))
Nontheless it is very insightfull to have this discussion and it gives me a broader vision. So thanks again for taking the time to discuss with me!
I do believe a blog to get my idea on rails including some more pictures/diagrams would be a good thing (much like your page Jaap).
So that people like yourself can get a good hang of the direction I mean to be going and can shoot on my idea's.
06-02-2014 06:53 AM
Casper, it not guessing but I have seen you already get the notes of that :smileygrin:.
Back to building a BI environment (reporting) you could extend that an analytics usage. There are different requirements on those parts.
The analytics part is impossible to do on faked data. That is the data-mining world, forecasting etc.
The first of BI is based mostly on a DWH. The cubes and the resulting dashboards is often classified as "business rules".
The "Data Vault" design (instead Inmon/Kimball) of that is latest one. Going for a "data lake" is thinking without business context.
The problem with those cubes is you have to build all options users possible are wanting into that (using DI/ETL tools). With in memory self-service tools that could be bypassed.
A DWH can be designed developed (version management) tested verified/accepted and brought into production (release management). All queries filters etc are brought into this kind of system and by that being audited. There are a lot of options for documentation inside SAS metadata including storing of windows files (pdf/word).
By the way I am missing "Information Map Studio" in your tooling. It is building views filters inside the SAS metadata on source tables.
An accountant is some person having a skillset. If he is only able to work by a list an wanting some answers as yes/no (do you have a backup?) the quality is to doubt.
I have also seen accountants not able to operate the tools (why should they know all), but being very well prepared.
Getting to some stuff an admin has to work hard for that. Just sitting aside together and questions: show me that and that, how do you, can you get that out of loggings .