Hi @ballardw and TYVM for the quick reply,
You may have to look for an environment change, not just a SAS code or SAS setting change. In the world of clients and servers anything with an impact on the server can slow other jobs. Such as adding another program not related to this process at all that uses either the server or network bandwidth or storage space.
I would ask the IT management if any programs were added or modified that use the SAS server or whatever systems generate the files you need to read daily.
We know for sure that, for the past month or two, our IT has been doing more work on the traditional host-to-SAS side. Our company data stay in their usual host DBMSs (DB2, Oracle, etc.), where IMS and CICS accounting transactions work on them, but lately IT is also copying the previous day's "final output" data into the SAS data lake (Teradata, Hadoop) so that we can access them via EG projects. That workload (I have no idea at what times it runs) surely burdens the SAS "servers".
I call them "servers", but they could be anything: a cluster of ordinary Linux servers, a cloud, or even host-grade machines still running Linux. I'm very far from the IT architecture, so I have no idea which machines they use to run the SAS engine(s). They are certainly on Linux, as EG reports it, while our EG scheduler doesn't use classic Linux cron but relies on Windows Task Scheduler (WTS): a VBS program launches EG and passes it the path and project name to execute, along with the user and password to open an EG session for the project.
Also for sure, the IT department working on SAS is not small and uses many machines; I just have no idea what grade they are, but I doubt they're classic "servers".
So "asking the IT management" is not easy at all. I opened a ticket with our HelpDesk, which escalated it to some "SAS IT" team, but the most they answered was:
1. try scheduling at different times (giving us no clue about which hours would be better)
2. try splitting the big project into smaller chunks, which sounds like passing the buck back to us without any objective hint about the problem (and which could take months, also considering there is no way to "arbitrate" the various chunks via WTS, which "just executes" but doesn't "manage")
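If the project were ever split into chunks, the missing "arbitration" could in principle sit in a small wrapper that WTS launches instead of the chunks themselves. The following is only a minimal Python sketch under assumptions not confirmed anywhere in this thread: that each chunk can be started by its own command, and that the command returns a non-zero exit code when the chunk fails (whether a VBS launcher for EG does this is unknown here). The function name `run_chunks` and the stand-in commands are hypothetical.

```python
import subprocess
import sys
import time


def run_chunks(commands, retries=1, wait_seconds=0.0):
    """Run chunk commands in order.

    A failing chunk is retried up to `retries` times; if it still fails,
    the remaining chunks are skipped. Returns a list of
    (command, exit_code, attempts) for every chunk that was started.
    """
    results = []
    for cmd in commands:
        attempts = 0
        while True:
            attempts += 1
            code = subprocess.run(cmd).returncode
            if code == 0 or attempts > retries:
                break
            time.sleep(wait_seconds)  # give a busy server time to recover
        results.append((cmd, code, attempts))
        if code != 0:
            break  # later chunks depend on this one, so stop here
    return results


# Stand-in commands (hypothetical): Python one-liners that only set an exit code.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
bad = [sys.executable, "-c", "raise SystemExit(3)"]

results = run_chunks([ok, bad, ok], retries=1)
for _, code, attempts in results:
    print(code, attempts)
# prints:
# 0 1
# 3 2
```

The third command is never started because the second one failed twice. WTS would still only "execute" the wrapper; the ordering and retry logic lives in the wrapper itself.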
My complete guess could be that there is something delaying the creation of the files you use but the ones from the previous cycle are available. So the SAS end starts normally, using that previous data, and does what it is told.
I have the same feeling: the error seems to be generated by some "timeout" between the SAS engine and SAS EG, because the engine is delayed too much by other heavy jobs. But that's only my feeling: I have no evidence, can't get any directly, and our HelpDesk didn't give any back.
You didn't actually describe what was received instead of "cause expected data have not been delivered". Has your customer mentioned, or been asked, if the data delivered was identical to the previous day's data? If so, something in your system is introducing a delay in generating the "new" files in time.
So, in brief: nearly every day (on demand, but often) our customer fills in an Excel file we left in their "input folder" with some contract codes they want "deep-searched" (24 months back from today()) in our transactions (we're the anti-money-laundering dept.). The extracted transactions are delivered raw, split into previous and current year (2 files); then we also group them in 4 different ways, as asked by the customer. Starting from a single account code, the process outputs 7 different files. There are usually from 3-4 up to 15-20 contract codes, so we get from 20-30 up to 140 output files per day, depending on the input.
As long as there were no errors, everything was done in 30-40 minutes, so around 2am all files were ready. Our customers (colleagues) start working at 8am, so it was all good for them: they moved the output files into their own folders and the process went smoothly every day (there are some "exceptions" we take care of, producing 2-3 output "exception lists", but that's a side thing with no real impact on the process). Files are named after their contract codes, so there's no problem with overwriting them either.
Nowadays, with the error plaguing the scheduled project (but very rarely affecting us when we run it by hand during the day!), in the morning our customer sees the output folder empty if the errors happened in the core step, where the data are prepared for the following grouping steps, or sees only some of the files. The errors happen totally at random, even when we schedule the process twice (1am and 3am): e.g. all codes get the 1st, 2nd, and 4th group-by data, but all lack the 3rd grouping, etc.
For example, this morning's 1am run on 6 codes produced the 2 yearly raw data files, got errors in the 1st and 2nd groupings, and produced the 3rd and 4th group data (6+12 files), plus the exception lists (22-23 files in total, exceptions included). So, when my colleague arrived at the office, he was asked to rerun the project manually: he did, and EG produced all 54 files (49 grouped data + 2 raw yearly + 3 exception lists).
The fact that the project works seamlessly when launched manually leads me to conclude that there is no error in the "application layer": the error (delay, timeout) is underneath, more probably in the engine or DBMS layer, where we can't put our hands at all.
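The morning check for an empty or incomplete output folder could be automated along these lines. This is only a rough Python sketch: the `<code>_<suffix>.xlsx` naming pattern, the suffix names, and the sample contract codes are all assumptions made for illustration (the only thing known here is that files are named after their contract codes), and the demo uses a temporary folder instead of the real output folder.

```python
import tempfile
from pathlib import Path


def expected_names(codes, suffixes, ext=".xlsx"):
    """Build the list of file names a complete run should produce.

    The '<code>_<suffix><ext>' pattern is hypothetical.
    """
    return [f"{code}_{suffix}{ext}" for code in codes for suffix in suffixes]


def missing_outputs(folder, expected):
    """Return the expected file names that are not present in `folder`."""
    present = {p.name for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in expected if name not in present]


# Demo in a temporary folder: simulate a run where the 3rd grouping failed.
with tempfile.TemporaryDirectory() as tmp:
    codes = ["C001", "C002"]                    # made-up contract codes
    suffixes = ["group1", "group2", "group3"]   # made-up suffixes
    expected = expected_names(codes, suffixes)
    for name in expected:
        if "group3" not in name:
            (Path(tmp) / name).touch()
    missing = missing_outputs(tmp, expected)

print(missing)
# prints: ['C001_group3.xlsx', 'C002_group3.xlsx']
```

A non-empty `missing` list after the night run would show exactly which groupings failed for which codes, and could be logged over several days to see whether the failures really are random.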
You may want to contact SAS tech support to see if they can help you generate better diagnostics to see exactly what is happening in your system to isolate relevant issues.
I have no direct access to SAS tech support, AFAIK. I'm trying to ask some IT colleagues I know but, as I'm not their "direct customer" (no contract, no budget for them), they may answer as a favor, or may not: it's not their problem.