BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Rodcjones
Obsidian | Level 7

I have a growing number of SAS enterprise guide .egp projects that are set on a schedule. Unfortunately many of them depend on very dynamic databases – by which I mean there are relatively frequent changes to tables and columns. There are also less frequent ETL failures. These collectively have the impact of messing with the outputs and deliveries that the .egp’s produce.

I would like to incorporate data integrity checks into all of my process flows that would halt processing and notify me/the admin that the task did not complete. I am thinking there are probably two kinds of checks.

  1. Make sure one (or multiple) datasets have the expected number of records (or meet a minimum threshold of number of records).
  2. Make sure no errors have been logged at one or more points in the process flow.

My questions to the community are  

  1. Is my perception of the kinds of integrity checks good (or are there other kinds of integrity checks I should also be considering)?
  2. Are the best practices for achieving this?
  3. Aside from the integrity checks, are there recommended approaches to halting a process flow? This question comes from some challenges I see when the .egp contains a relatively large number of programs, rather than a small number of programs with %include statements that could be enclosed in a macro that is based on the results of the integrity check. In other words is it possible to enclose a series of *programs* within a macro using the SAS EG process flow functions?

 

SAS 9.4, EG 7.1, Server 2012 R2

 

Thank you all for taking a look.

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

I fully endorse @Kurt_Bremser's response. I highly recommend you check out SAS server-based scheduling instead of using your PC. This is particularly true if you are running business-critical jobs.

 

I'm also a firm believer in simple options for controlling SAS jobs. Please look at the SAS system options SYNTAXCHECK and/or ERRORABEND. SYNTAXCHECK switches SAS into syntax check mode when an error occurs while ERRORABEND will abort the SAS job. Both of these options are very useful for controlling scheduled SAS batch jobs.

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

You have reached a stage where you need a real scheduler that can handle dependencies.

As said in the other thread, convert everything to .sas code files that can be run in batch. Write your code so that it runs without ERRORs and WARNINGs when successful. Make your ETL jobs so that they fail when the DB structure changes (eg always use a comprehensive variable list in SQL selects, never select *). In this way you catch changes early in your analytic chain.

 

Since I get my data in flat-file unloads from the DB, it is even easier to detect changes with the read data steps.

SASKiwi
PROC Star

I fully endorse @Kurt_Bremser's response. I highly recommend you check out SAS server-based scheduling instead of using your PC. This is particularly true if you are running business-critical jobs.

 

I'm also a firm believer in simple options for controlling SAS jobs. Please look at the SAS system options SYNTAXCHECK and/or ERRORABEND. SYNTAXCHECK switches SAS into syntax check mode when an error occurs while ERRORABEND will abort the SAS job. Both of these options are very useful for controlling scheduled SAS batch jobs.

JuanS_OCS
Amethyst | Level 16

Hello @Rodcjones,

 

from the technical point of view, I am sure @SASKiwi and @Kurt_Bremser can provide help.


From the Architecture point of view, I think you probably are familiar with the concepts Data Warehouse (DWH), Datamart (DM) and single version of the truth.

 

DHWs are generally dynamic as you say, but for reporting and data analysis, you would need to have your own version of the truth, updated only under your control (daily, 15 min, whatever you need). That "single version of the truth" it is a momentary version of the data, a snapshot, and it is normally called Datamart. Just the data you will need.

LinusH
Tourmaline | Level 20
Can you describe more the concept of "very dynamic databases"?
It doesn't sound like a robust architecture. Some dynamics are ok in the presentation layer, where you often can refresh the data.
Usually, a through data modelling practice can solve some of the most vulnerable parts, together with a clear maintenance operation model.
Data never sleeps
Rodcjones
Obsidian | Level 7

Thanks for the question. The architecture is robust. What I meant to convey is that the jobs scheduled are associated with quality improvement projects with an intentionally high rate of change in how items are documented and stored within the database. New workflows are implemented frequently, and while the intention is to communicate the downstream impacts to developers, sometimes the impact is not understood right away or not communicated. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1065 views
  • 6 likes
  • 5 in conversation