Hi
We have recently installed LSF to run codes for our business users.
Jobs in LSF have 3 different exit codes:
0 - Finished successfully
1 - Finished with warnings
2 - Finished with ERRORS.
Some of the user programs generate WARNINGs which are expected. Is it somehow possible to force LSF to accept warnings?
My other question is about the combination of Complete + Rerun.
I.e. I have a job that failed. I want to click on Complete and then Rerun, to move this job out of the "Exit" state. Are there extra configuration steps I need to do? Right now I only get "No starting points or exited items. Flow cannot rerun". I don't want to actually rerun it, I just want to move it from Exit to Done.
Agree with you on this. But for some reason Complete and rerun flow doesn't work here 😞
@irinaia wrote:
Some of the user programs generate WARNINGs which are expected.
Sloppy and bad programming habits that have to be dealt with. Restructure your code(s) so that WARNINGs are either not generated at all, or use SAS options to suppress them selectively where they can't be avoided.
And don't try to tell anyone this isn't possible. It IS possible, period. You just have to work at it.
Is it somehow possible to force LSF to accept warnings?
That would be an extremely foolish thing to do, as it will cause "unexpected" WARNINGs to go undetected. Never do such a thing.
I agree with the many responses that it is not good practice to suppress warnings. On the other hand, I am also practical. The issue at hand is that LSF does not distinguish between ERROR and WARNING: both give an rc not equal to 0 and are thus interpreted as an unsuccessful job. Because such jobs disrupt the flow, warnings carry a severe penalty, often too severe. So we do not halt on warnings.
To keep jobs with systematic warnings caused by sloppy programming out of production, one can have such a job fail acceptance testing. The warnings that one is left with should never go unnoticed, but should not always disrupt a production run.
So what to do?
We have a two-pronged approach. One is to reset rc=1 to rc=0 at the OS level so warnings are not interpreted as errors by LSF. The other is to log job results with the original rc in a database and report on that. In fact we use the EOM DI Monitor to do that in real time. A browser-based user interface shows jobs that ended with a warning in yellow (even though the return code is reset to 0), which prompts investigation and remedy.
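A minimal sketch of the two prongs in plain shell, for illustration only. The function names (log_rc, remap_rc, run_job) and the tab-separated log file are assumptions made up for this example; they are not taken from sasbatch.sh, the SAS note, or the EOM DI Monitor, and a real setup would write to a database rather than a flat file.

```shell
#!/bin/sh
# Hypothetical helpers; names and log format are assumptions for illustration.

# Prong 2: record the ORIGINAL return code so warnings stay visible.
# Usage: log_rc JOB_NAME RC LOGFILE
log_rc() {
    printf '%s\t%s\t%s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$1" "$2" >> "$3"
}

# Prong 1: map the return code reported to the scheduler.
# 1 (warning) becomes 0; every other code passes through unchanged.
remap_rc() {
    if [ "$1" -eq 1 ]; then
        echo 0
    else
        echo "$1"
    fi
}

# Run a command, log its real rc, and return the remapped rc.
# Usage: run_job JOB_NAME LOGFILE command [args...]
run_job() {
    job_name=$1
    rc_log=$2
    shift 2
    rc=0
    "$@" || rc=$?          # capture the rc without tripping 'set -e'
    log_rc "$job_name" "$rc" "$rc_log"
    return "$(remap_rc "$rc")"
}
```

Called as, say, run_job my_job /tmp/rc.log followed by the real batch command (both names hypothetical), the scheduler would see 0 for a warning while the log file keeps the original 1 for reporting.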
Hope this helps,
- Jan.
Or just open the Lev1/SASApp/BatchServer/sasbatch.sh file, uncomment the code which handles return code 1 (SAS warning), and change it to 0.
Hello @irinaia,
the exit codes can be controlled as mentioned by @nfarinha, or touched on by @jklaverstijn. This is documented in SAS Usage Note 24391: Changing an exit status when using the Schedule Manager Plugin http://support.sas.com/kb/24/391.html
I understand the problem: SAS generates many WARNINGs, of all kinds, and these stop job execution in LSF.
The main reason is that in many kinds of companies WARNINGs are not allowed at all, for example in Life Sciences and some blue chip companies.
So you can work around that restriction by modifying the result codes (rc), as described in the SAS Note.
For whoever might be interested, here is my personal interpretation: the references to "sloppy programming/code" from @Kurt_Bremser and @jklaverstijn might sound hard to some, but they are candid and genuinely aimed at helping. A workaround like the one in the SAS Note is widely used and will allow your SAS jobs to finish, but it also opens the door to "unseen" malfunctions that you would only identify much, much later in your data, leading to innumerable headaches. Hence the reference to "sloppy code". It is good advice and positive criticism; strong and consistent code leaves everyone with fewer concerns about Data Governance and Data Management in a company's business processes.
Just to add some clarification: I am responsible for ~ 1000 SAS batch jobs, and not one of them ends with a WARNING return code of 1 on successful execution. We have even reduced the use of WARNING-suppressing options (like DKRICOND) by checking metadata first and executing code conditionally.
Anything we didn't expect and explicitly code for has to throw a non-zero return code, so we don't have one little problem permeating through the whole processing chain and wreaking unnoticed havoc. We go so far as to add additional error-checking shell script syntax to catch certain phrases in the SAS log, in order to be alerted to conditions that SAS itself does not recognize as "wrong". This includes (among others) FTP return codes that only happen when connecting to IBM z/OS systems.
The SAS codes are mainly doing ETL, but there's also a lot of statistics, print output, and graphics.
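As a rough illustration of that kind of log check (not the actual scripts described above), a shell function can search the SAS log for fixed phrases and return a non-zero code when one is found. The function name scan_log, the return codes 3 and 4, and any phrases passed in are assumptions chosen for this sketch.

```shell
#!/bin/sh
# Hypothetical helper; name and return codes are assumptions for illustration.
# Returns 0 if no phrase is found, 3 if any phrase matches,
# 4 if the log file cannot be read.
# Usage: scan_log LOGFILE PHRASE...
scan_log() {
    log_file=$1
    shift
    if [ ! -r "$log_file" ]; then
        echo "scan_log: cannot read $log_file" >&2
        return 4
    fi
    for phrase in "$@"; do
        # -F: treat the phrase as a fixed string, -q: no output, status only
        if grep -F -q -e "$phrase" "$log_file"; then
            echo "scan_log: found '$phrase' in $log_file" >&2
            return 3
        fi
    done
    return 0
}
```

A wrapper could call this after the SAS step finishes and exit with the scan result whenever SAS itself returned 0, so that a phrase in the log still fails the job.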
So it is possible to do.