SAS Life Science Analytics Framework and the clinical data products from SAS

CDI Best Practices

SAS Employee
Posts: 8

Let's talk about best practices in CDI. No matter where you're at in the process of using CDI - beginner, comfortably competent, expert - best practices are important.

I'd like to start with a couple of things on my "Never Do This!" list. Sometimes I hesitate to share these. After all, if you didn't know you could do these things, then you wouldn't do them, and there would be no need for me to tell you not to. And, if you're anything like me, you might be a very curious person. So when someone says, "Hey, don't do that!", you might think, "Hmm... but what happens if I do?" and then go do it just to see what happens. On the other hand, if I don't share my "Never Do This!" list you may stumble upon these things and do them without realizing the implications.

Without further ado, I present Melissa's Never Do This List:

1. Never modify code on the Code tab of a job or transformation.

What are you talking about - I've never heard of this?

If you look at an open job, normally you'll be looking at the Diagram tab. There are three other tabs, too: Code, Log, and Output. If you look at the Code tab you will see all the code automatically generated by DI Studio for the job. And at the very top there's a drop-down menu for the Code generation mode. By default it is "Automatic", but you could change it to "User Written Body" or "All User Written", and then you would be able to modify the SAS code on this tab. Most transformations also have a Code tab with a similar ability to change the Code generation mode.

Why? What bad things might happen?

Once you change the Code generation mode to anything other than Automatic, the job no longer pays any attention to what you might do in the GUI on the Diagram tab. It only runs the code that is on that Code tab. If you change something on the Diagram, you will not see that reflected when the job runs. This can be very confusing if you don't understand how this works or if another user modifies your job without knowing it was on user-written mode. It also completely defeats the purpose of using CDI. CDI automatically generates code; that's part of the core of this product. If you go into that code and tweak it, you're no longer using the product in its intended way.

Can I undo it if I accidentally did this already?

Yes! The fantastic news is that all you have to do is go back to the Code tab and change the Code generation mode back to "Automatic". If you've made changes on the Diagram tab while the mode was User Written, the newly-auto-generated code will pick it up. So, thankfully, this is easily fixed. (Do keep in mind that you will also immediately lose any code you may have written while in User Written mode; be sure to copy and paste it elsewhere if it's something you need to keep.)
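
To make the behavior described above concrete, here is a toy Python model. It is only a sketch and not DI Studio code: the class, its fields, and the idea that user-written mode starts from a snapshot of the generated code are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

AUTOMATIC = "Automatic"
USER_WRITTEN = "All User Written"


@dataclass
class ToyJob:
    """Toy stand-in for a job; hypothetical, not a DI Studio API."""
    diagram_steps: list = field(default_factory=list)
    mode: str = AUTOMATIC
    user_code: str = ""

    def generated_code(self) -> str:
        # Pretend code generation: one comment line per step on the diagram.
        return "\n".join(f"/* step: {step} */" for step in self.diagram_steps)

    def set_mode(self, mode: str) -> None:
        if mode == AUTOMATIC:
            # Switching back to Automatic discards any hand-edited code.
            self.user_code = ""
        elif self.mode == AUTOMATIC:
            # Assumption: user-written mode starts from the generated code.
            self.user_code = self.generated_code()
        self.mode = mode

    def code_that_runs(self) -> str:
        # Automatic mode follows the diagram; any other mode ignores it.
        if self.mode == AUTOMATIC:
            return self.generated_code()
        return self.user_code


job = ToyJob(diagram_steps=["Extract", "Join"])
job.set_mode(USER_WRITTEN)
job.diagram_steps.append("Sort")   # diagram change made in user-written mode
stale = job.code_that_runs()       # the new "Sort" step is not in what runs
job.set_mode(AUTOMATIC)
fresh = job.code_that_runs()       # regenerated code now includes "Sort"
```

In this sketch the step added to the diagram has no effect until the mode goes back to Automatic, and at that point whatever was in `user_code` is gone.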

What other way could I get my need met?

If there's something you need the job to do that it's not doing, there are other options. First, ask for help. Maybe you just don't know how to get it to work the way you want. Otherwise, you might need a custom transformation so that you can repeat a programming task that DI Studio can't handle with its built-in transformations. As a last resort, you could use a User Written Code transformation if you truly need a one-off that cannot be done by DI Studio's other transformations and that you will never need to repeat. That's the only appropriate place for hand-writing code.

2. Never modify permission settings on the Authorization tab for an object inside the CDI application.

What are you talking about - I've never heard of this?

If you look at the Properties of pretty much any object within CDI, you will see there is an Authorization tab. Listed on the tab is every group or user who has specific permissions defined. Selecting one of these groups or users will show you the permissions they have on the bottom half of the window. If you have WriteMetadata permission to the object, you could modify those permissions settings on the object.

Why? What bad things might happen?

Chances are, someone in your organization has put some serious thought and planning into the permissions model in your CDI environment. If you go changing it here, you could make a big mess of that permissions model. You could deny people access to objects they should have access to, grant access to people who shouldn't have access, and create serious confusion for the administrators who are responsible for the permissions model. You could even accidentally deny yourself access to objects. This could have further implications for objects that depend on each other for access. If you denied access to a user for all the source data in a study, when they open a job in that study, all of the source data would appear missing from the job to them, and they will likely have no idea why. The job doesn't give much helpful information other than to say that the job was modified since it was last opened, and appears unsaved. If the user then saves that job, the source tables will truly be removed from the job for everyone. What a mess!
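
The "source tables vanish for everyone" scenario can be sketched with a toy Python model. Nothing here is a SAS API: the class, the table names, the user names, and the way denials are stored are all hypothetical, chosen only to mirror the sequence of events described above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStudyJob:
    """Toy stand-in for a job and its source tables; hypothetical, not a SAS API."""
    sources: list
    # table name -> set of users who have been denied access to that table
    denied: dict = field(default_factory=dict)

    def visible_sources(self, user: str) -> list:
        # A user only sees the source tables they still have access to.
        return [t for t in self.sources if user not in self.denied.get(t, set())]

    def save_as(self, user: str) -> None:
        # Saving stores the job as this user sees it, hidden tables included.
        self.sources = self.visible_sources(user)


job = ToyStudyJob(sources=["DM_RAW", "LAB_RAW"])
job.denied = {"DM_RAW": {"pat"}, "LAB_RAW": {"pat"}}

seen_by_pat = job.visible_sources("pat")     # both tables appear missing to pat
job.save_as("pat")                           # pat saves the "modified" job
seen_by_lee = job.visible_sources("lee")     # now missing for lee as well
```

The point of the sketch is the last line: a user who was never denied anything still ends up with an empty job once someone else has saved it.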

Can I undo it if I accidentally did this already?

Probably, but it could be messy and you might not be able to do it on your own. The best bet is to get in touch with the administrator who is responsible for the permissions model so that they can take charge of getting the permissions properly reset. There is a utility they can use that can show all of the explicit (non-inherited) permissions throughout the entire system, which can help them verify that the actual permissions are back in line with the permissions model. If you've made a lot of changes, it could be complicated and time-consuming. There is also a difference between permissions set explicitly, inherited, or controlled by an Access Control Template, and you may not know which of those are supposed to be used on the object. I definitely encourage you to reach out to your administrator if this happens, and don't try to fix it on your own without letting them know about the issue.

What other way could I get my need met?

If you feel like the permissions or access are not correct for an object - whether it's a study, folder, table, domain, or anything else - find out who is responsible for the permissions model in your organization and talk to them. They can investigate your concern and fix it or explain to you why the settings are the way they are. If your organization hasn't put a strong permissions model into place, think about doing so. If you need help, reach out to the SAS Technical Consultant working with your company. If you don't have one assigned, reach out to us here on this community and I'm sure we can help.

What other things would you put on the "Never Do This!" list? Please share!

Valued Guide
Posts: 3,208

Re: CDI Best Practices

Melissa, nice start for a dedicated environment such as CDI.

I would start with:

0/ Never trust a tool salesperson's strategic or technical statements. Take "21 CFR Part 11" as an example (you must know that one).

What are you talking about - I've never heard of this?

     Well, it is about the compliance requirements that regulators set for your environment. I would prefer to refer to ISO 27k and NIST alongside FDA, HIPAA, etc.

     A nice starter would be http://www.fda.gov/downloads/medicaldevices/newsevents/workshopsconferences/ucm420828.pdf

Why? What bad things might happen?

     Being non-compliant is one thing when regulators come for an audit.

     The worst thing that can happen is being hacked, suffering data breaches, or just leaking all your secret information to competitors.

     In the end, the result may well be a large negative business impact.

Can I undo it if I accidentally did this already?

     There is still time to correct things, although business cultures could pose a big challenge.

     You have to become aware of all the required policies and of the business risks involved, including their possible impact and probability. If this is a big change process, will you be in time?

     The technical impact can be as little as some adjustments, but expect that a redesign is needed.

     When a supplier has delivered only a facade, an image of a building, the real thing that is needed could be missing.

     This is what is commonly missing with SAS: they are not aware of ISO 27k, SIEM, and RBAC (ITSM, COBIT) at the business level. This is a serious gap.

What other way could I get my need met?

     Is there another way?

For fun, let me review the two don'ts you mentioned.

Ad 2/ (security  IST/SOLL)

An Authorization tab that is open for modification by a developer or user: something is terribly wrong there, because access controls should be designed according to requirements and reviewed for unintended changes as a regular task. If a technical limitation means the tab is left open for some added options, the only change that should be allowed is making access more restrictive. Rules inherited from defined controls are the mandatory way. On the bright side, you can correct authorization mistakes by removing all the changes you added.

Ad 1/ (Application Life Cycle Management - DTAP)

With DI (Data Integration), changes are only allowed at the development stage. At the testing and acceptance stages, the generated code is accepted as is, including some quality checks.

For business policies and requirements, it does not really matter whether you use hand-built code or DI-generated code. What must be substantiated is how the code was developed and what its quality is. When no applicable transformation is available, some hand-written code is allowed, provided it is documented and described as such. When business policies state that the SAS metadata is leading, then you cannot modify the code by hand. This should be checked during testing, and if a violation is found, the work comes back to you to explain why you did not do it as agreed in your business.

For both of the points you mentioned, it is not a SAS directive that determines how users should work, but the client's business policy.

---->-- ja karman --<-----
SAS Employee
Posts: 8

Re: CDI Best Practices

Hello Jaap,

Thank you for participating in this discussion. 21 CFR Part 11 compliance is at the heart of all of our Health and Life Sciences Industry customers' businesses. Most of these businesses have dedicated Compliance and QA departments responsible for evaluating all of their business processes and software solutions to ensure that the companies remain compliant with this regulation. Our SAS Account Executives build strong relationships with our customers and suggest solutions to them that fit their needs; however, ultimate responsibility for 21 CFR Part 11 (and other regulation) compliance lies with each customer.

As for SAS Clinical Data Integration, I have seen companies take different approaches to incorporating this product into their compliant environment. Some use it in conjunction with SAS Drug Development, where CDI is considered a development area. Once a job is finalized in CDI, the Publish to SAS Drug Development feature is used to store the SAS program in the SAS Drug Development repository where it is placed under version control. Some customers use CDI as a standalone product with separate DEV, TEST, and PROD environments. Completed jobs are deployed, which is a process that creates a SAS program from the job. The deployed jobs can also be placed under version control with software such as SVN.

SAS Clinical Data Integration is built on SAS Data Integration Studio as a foundation, with additional plugins that allow the software to take advantage of clinical data standards from SAS Clinical Standards Toolkit. The majority of the functionality in CDI is really DI Studio functionality. The two best practices I mentioned so far are cases where the software works exactly as it is designed to, but from the standpoint of a Health/Life Sciences customer I would recommend avoiding the practices.

Permissions within the CDI/Business Intelligence environment are metadata permissions. In order to be able to modify an object, you must have WriteMetadata permissions to the object. But, with that permission also comes the ability to modify the metadata permissions on the object. This is a different setup than a typical physical file permissions model, where you can grant Write access to an object, but deny the ability to Administer the object's permissions.
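
The contrast between the two models can be written down as a pair of toy checks in Python. This is a simplified sketch, not a SAS or operating-system API: the function names and the "Administer" right are hypothetical labels for the behavior described above.

```python
def can_edit_permissions_metadata_style(granted: set) -> bool:
    # As described above: WriteMetadata on an object also brings the
    # ability to change that object's permissions.
    return "WriteMetadata" in granted


def can_edit_permissions_file_style(granted: set) -> bool:
    # Hypothetical file-style model in which changing permissions is a
    # separate right that Write access does not imply.
    return "Administer" in granted


metadata_user = {"ReadMetadata", "WriteMetadata"}
file_user = {"Read", "Write"}

metadata_result = can_edit_permissions_metadata_style(metadata_user)  # True
file_result = can_edit_permissions_file_style(file_user)              # False
```

In the first model, anyone who is allowed to edit an object can also change who else may see or edit it, which is why the design of the permissions model matters so much.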

There are ways to design the permissions model for an environment so that it is appropriately restrictive for a customer's tastes. That's where a SAS consultant like me can be helpful, having participated in the design for many different customers and understanding fully the implications of different permissions settings and options.

The preferred method for developing code in CDI can be debated, and I can understand the idea that the final product is all that is important, regardless of where custom code might be built in. From my perspective, I try to guide customers to use the product's built-in capabilities as much as possible in order to gain the most return on investment from the product and to maintain consistency between users. I have also seen customers run into problems when using User-Written mode on a transformation or job, where the job no longer performs as expected but nobody can understand why. This has led to blocker-level problems where customers have believed the product itself has a serious bug. A single user could get away with a mixture of auto-generated and user-written code if they have a great memory and document their work well; but for a group of users sharing an environment, it's much more likely to lead to problems.

You are right that I am discussing business process best practices and not any kind of SAS directive about how to work! And my suggestions come from my own stumbling with the software over the last few years, common questions and problems that I have seen come from customers, and experience seeing which working practices work best after watching many different clients take different approaches. Ultimately, each client must decide their own best practices, and they don't have to follow my ideas to the letter. And that's the beauty of this discussion board where we can all contribute ideas, even conflicting ones, so that we can all improve our environments and business practices!

Have a wonderful day,

Melissa

Valued Guide
Posts: 3,208

Re: CDI Best Practices

That is a long reply, Melissa. It is great that you agree that the business needs, the processes, and their compliance are what we are doing the work for.

I agree with you: you shouldn't violate the tool's design, and you should use it as much as possible in the way it was designed. It is better to adjust the developer than to adjust the tool.

I reacted rather strongly to your post. Sorry for that; it is nothing personal.

In my personal experience, SAS account executives have too often ignored, or are still ignoring, those business compliance guidelines, which leads to bad relationships, or they just do not really recognize what is behind the simple words being used.
Having worked with a lot of systems (physical and SAS), I do not agree that you can grant Write access but deny the ability to Administer on, e.g., Unix HFS-like systems. You are referring to the often-used Windows approach, where these can be set separately.

With metadata security, a lot is possible to make it better, and it has improved in newer releases. There is usually an authentication tab for doing something on the admin side, or there are other options.

I know of several pitfalls and even shortcomings there. Sorry again; my personal experience of getting them recognized was not positive.

It is better for all of us to have positive cooperation between SAS and its customers/clients, with more shared goals and targets, for a further win-win for all.

---->-- ja karman --<-----
SAS Employee
Posts: 8

Re: CDI Best Practices

Thanks, Jaap. Many of us are passionate about our use of SAS and our professional relationship with SAS solutions - including me! And I know that we all sometimes have experiences that aren't as great as we had hoped (don't even get me started talking about trying to buy my last car). At SAS we definitely strive to be among the best companies to work with, and we take great interest in customer feedback.

If anyone ever has questions or concerns about their SAS products, there are several options for getting help.

1. Contact your SAS Technical Consultant (like me).

2. Contact SAS Institute Technical Support. It's a good idea to have your site or customer number handy, as well as the product name and version and the operating system on which it is installed.

3. Contact the SAS Customer Loyalty Team. Our Loyalty representatives advocate for customers, help connect customers with other users, and empower users to get the most out of their SAS software investment.

Thanks!

Melissa

Valued Guide
Posts: 3,208

Re: CDI Best Practices

Melissa,

Thanks again for your response. Yes, I know those kinds of contacts for reaching SAS. Still, it is nice that you have brought them up again.

So my response is about the other direction of a good cooperation. Let me see; at your client/customer site there are:

  1. Desktop policies in place
    That is, e.g., a closed desktop with another party doing application (service-middleware layer) support and rollout.
  2. Server policies in place
    That is about hardening a system: a dedicated layout for placing items with ownership, access control, and operational and security monitoring.
  3. Networking
    At the networking level there are all kinds of segregation and port isolation (firewalls). Requirements on the method of encryption are added.
  4. RBAC (role-based access control) as a process playing a role in the organization
    Often that is implemented with AD (Windows) or LDAP (Unix), but there can be others around, including isolated ones.

>>> Why would SAS Institute, you and your fellow consultants, not cooperate with the ICT people responsible for that?

What I am seeing instead of cooperation is a fight against that, with statements that SAS will replace all of it with its own approaches.

A fight, or being ignored, is not cooperation. Citing "no IT staff needed" as a success factor is not cooperation from the start.

You are in the clinical programming area; nothing wrong with that. IT, or ICT, is a technical area that involves a lot of higher education at the same level as medical studies. We do not expect an ICT specialist to give medical advice.

>>> Can you explain why a clinical person assumes they have the same expertise as a real IT specialist?

That leaves us, given that we share the goal of making SAS better, with the question: how can we cooperate?

---->-- ja karman --<-----
SAS Employee
Posts: 8

Re: CDI Best Practices

I absolutely agree that our customers' IT departments are a critical and integral part of the implementation of a CDI environment (and really ANY software implementation). In the projects that I have been involved with that have had an onsite installation, the customer's IT department was very involved and well-represented. Since I work in SAS Solutions OnDemand, many of the projects I work with are hosted solutions, meaning the hardware and software reside onsite at a SAS hosting facility and the end-users log into their environment either via a web-based portal or a remote desktop. In those cases, there is typically no software installation done on the customer side, and so their IT department may be much less involved.

It is also worth clarifying that this forum is targeted at end users of the CDI software. These users are NOT typically IT professionals, and so they are not expected to have a full understanding of the permissions model or technical specifications that define the environment. I'm certain we have forums that target platform administrators in case they need help with something.

If you would like to further discuss some of the challenges you're personally facing in your work, please feel free to reach out to me. I can work with you to find your account representative and facilitate a discussion.

SAS Employee
Posts: 8

Re: CDI Best Practices

Alright, let's get back to the discussion of best practices using CDI! I have a long list that I keep handy to help new users. Here are a few more of my favorites, this time on the "DO" side of things.

1. Treat creating a CDI job similarly to how you would create a new SAS program. Know your source data. Know your target data. Have good specifications. A great program (or CDI job) starts with a well-thought-out plan.

2. Understand your domain dependencies, and create your jobs in the order of those dependencies. For example, DM is an input to many other domains, so create your job to populate DM early.

3. Turn off automatic propagation! By default it is turned on. This means that when you connect one transformation to another in your job, all of the columns from the source will automatically map as is to the target. We typically want to be very specific about the columns that we carry through each transformation in our job, and automatic propagation makes that very hard. I suggest turning this off universally by selecting Tools -> Options and then on the Job Editor tab deselecting "Automatically propagate columns" under Automation Settings.

4. Use an Extract transformation as a buffer between every source table and the subsequent transformations in the job. This buffer preserves the metadata information about the source tables so that in the case that the source table is removed from the job or the table metadata object gets changed, moved, or deleted, your job will still retain the table metadata. Imagine if you connect two source tables directly to a Join transformation and code a variety of complex formulas in the Expressions. Then, another user deletes the source table objects from CDI. When you open your job, the source tables will be gone, and when you open your Join transformation, every reference to a source column in the Expressions will be changed to "[missing]". Even if you reconnect the source tables, the Expression will not know which columns belong to each expression. You will have lost that work. However, if you put an Extract transformation off of each source table and directly extract all the columns you might need for the job, without any further transformation or modification, the target of that Extract transformation now contains all the metadata of the source table columns. If the Join is done from those two Extract targets and you lose your source tables, the Join will still contain all of its Expressions and metadata. And all you need to do in order to make the job whole again is reconnect your source tables to their Extract buffers and remap the columns. This is an example of how it benefits you if something goes wrong, but this is also the best way to create a reusable job template. By setting up a job with Extract buffers, you can then remove the source tables used for the development of the job and save it that way. A user can copy that job to a new study and just connect the appropriate source tables into the job to reuse it.

5. Document your work! I know you're not writing code from scratch, but you can still add comments and documentation about your work. When you open any transformation, the General tab displays the Name and Description of the transformation. Modify them! Don't leave every Extract transformation named "Extract". Name it "Extract lab dataset", or "Map to DM domain columns" so that it's clear what that transformation is doing. You can also add important information into the Description. Another great tool is the Sticky Note. It adds a small yellow comment to the job, which can help alert other users to important information about the job. Whatever methods you choose, treat this as you would any other SAS program and document what you are doing as clearly as you can so that it is useful to other users and even yourself if you have to revisit your work weeks or months later. How many of us remember what we were thinking when we created a job six months ago?
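
For tip 2 above, working out a build order from domain dependencies is a topological sort. Here is a minimal Python sketch using the standard library's `graphlib` (Python 3.9+). The dependency map is purely hypothetical: which domains actually feed which depends on your study and your specifications.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each domain lists the domains it takes as input.
# Replace this with the dependencies from your own specifications.
domain_inputs = {
    "DM": set(),
    "AE": {"DM"},
    "LB": {"DM"},
    "EX": {"DM"},
}

# static_order() yields every domain after all of the domains it depends on.
build_order = list(TopologicalSorter(domain_inputs).static_order())

for position, domain in enumerate(build_order, start=1):
    print(f"{position}. create the job that populates {domain}")
```

With this map, DM always comes first; the relative order of the other domains is not significant because they do not depend on each other. If the map contains a circular dependency, `static_order()` raises `graphlib.CycleError`, which is a useful signal that the specifications need another look.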

What other end-user best practices do you have? I can still go on for days!

Melissa
