04-26-2018 11:32 AM
We have recently installed LSF to run codes for our business users.
Jobs in LSF have 3 different exit codes:
0 - Finished successfully
1 - Finished with warnings
2 - Finished with ERRORS
Some of our users' code generates WARNINGs which are accepted. Is it somehow possible to force LSF to accept warnings?
Another question is about the combination of Complete + Rerun.
I.e., I have a job that failed. I want to click Complete and then Rerun, to remove this job from the "Exit" state. Are there extra configuration steps I need to do? Currently I only get "No starting points or exited items. Flow cannot rerun". I don't want to rerun it; I just want to move it from Exit to Done.
04-27-2018 02:21 AM
05-14-2018 05:02 AM
Some of our users' code generates WARNINGs which are accepted.
Sloppy and bad programming habits that have to be dealt with. Restructure your code so that WARNINGs are either not generated at all, or use SAS options to suppress them selectively where they can't be avoided.
And don't try to tell anyone this isn't possible. It IS possible, period. You just have to work at it.
Is it somehow possible to force LSF to accept warnings?
That would be an extremely foolish thing to do, as it will cause "unexpected" WARNINGs to go undetected. Never do such a thing.
05-14-2018 05:40 AM
I agree with the many responses that it is not good practice to suppress warnings. On the other hand, I am also practical. The issue at hand is that LSF does not distinguish between ERROR and WARNING. Both produce a return code (rc) not equal to 0 and are thus interpreted as an unsuccessful job. Because an unsuccessful job disrupts the flow of jobs, warnings incur a severe penalty, often too severe. So we do not halt on warnings.
To keep jobs with systematic warnings caused by sloppy programming out of production, one can make such jobs fail acceptance testing. The warnings that remain should never go unnoticed, but they should not always disrupt a production run.
So what to do?
We have a two-pronged approach. One is to reset rc=1 to rc=0 at the OS level so that warnings are not interpreted as errors by LSF. The other is to log job results with the original rc in a database and report on that. In fact, we use the EOM DI Monitor to do that in real time. A browser-based user interface shows jobs that ended in warning as yellow (even though the return code is reset to 0), which spurs investigation and remedy.
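To make the two prongs concrete, here is a minimal sketch of a wrapper that could sit between LSF and the actual job. It is not our production setup (that relies on the EOM DI Monitor). The SQLite database, the table name job_results, and the function names are assumptions made purely for illustration. The idea is only: run the command, record the original rc, then hand LSF a 0 when the original rc was 1.

```python
import sqlite3
import subprocess
from datetime import datetime, timezone

WARNING_RC = 1  # per this thread: 1 = finished with warnings, 2 = finished with errors


def map_return_code(rc: int) -> int:
    """Report a warning (rc=1) as success; pass every other code through unchanged."""
    return 0 if rc == WARNING_RC else rc


def log_result(db_path: str, job_name: str, original_rc: int) -> None:
    """Store the original rc so warnings stay visible for reporting.

    The table layout is hypothetical; any database would do.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS job_results "
                "(job_name TEXT, original_rc INTEGER, finished_at TEXT)"
            )
            conn.execute(
                "INSERT INTO job_results VALUES (?, ?, ?)",
                (job_name, original_rc, datetime.now(timezone.utc).isoformat()),
            )
    finally:
        conn.close()


def run_job(command: list, db_path: str, job_name: str) -> int:
    """Run the job, log its original rc, and return the rc the scheduler should see."""
    original_rc = subprocess.run(command).returncode
    log_result(db_path, job_name, original_rc)
    return map_return_code(original_rc)
```

A small launcher script would call run_job(...) and pass the result to sys.exit(), so the scheduler sees 0 for warnings while the database still holds the 1. Errors (rc=2) are deliberately passed through, so a failed job still stops the flow.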
Hope this helps,