Hi,
We are trying to identify the jobs that have exited (terminated) and the reason each one exited. For analysis, we need the list of exited jobs, their IDs, and the reason for each exit.
Please help us find these exited jobs using LSF, as a lot of jobs are being terminated automatically. Please also let us know how to go about getting these details from LSF.
Thanks in Advance,
Siddhu
How about running (adjusting the range accordingly):
bhist -t -T "$(date -d "1 week ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | egrep -o 'Job.[<>0-9]+.exited.with.exit.code.[0-9]+'
[sas@trcv003 ~]$ bhist -t -T "$(date -d "1 week ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | egrep -o 'Job.[<>0-9]+.exited.with.exit.code.[0-9]+'
Job <36296> exited with exit code 102
Job <36297> exited with exit code 102
Job <36299> exited with exit code 102
Job <36300> exited with exit code 102
Job <36301> exited with exit code 102
[sas@trcv003 ~]$
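If the list needs to go to the business for analysis, the lines above could be turned into a CSV file. This is only a rough sketch: the function name exits_to_csv and the output file name are made up for illustration, and it assumes the input lines look like the output shown above (with or without the angle brackets around the job ID).

```shell
# Convert lines such as "Job <36296> exited with exit code 102"
# into CSV rows "job_id,exit_code". Lines that do not match are skipped.
# exits_to_csv is a hypothetical helper name, not an LSF command.
exits_to_csv() {
  echo 'job_id,exit_code'
  sed -nE 's/.*Job <?([0-9]+)>? exited with exit code ([0-9]+).*/\1,\2/p'
}

# Possible usage with the command from this thread (file name is an assumption):
#   bhist -t -T "$(date -d "1 week ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" \
#     | egrep -o 'Job.[<>0-9]+.exited.with.exit.code.[0-9]+' \
#     | exits_to_csv > exited_jobs.csv
```

The resulting file can then be opened in a spreadsheet or imported into SAS for further analysis.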
Thanks for your reply Greg.
A number of jobs are terminated every day (sometimes by the user, sometimes by the system), so we need the list of job IDs, along with the reason, for the business to analyse.
So we need to identify the exited (terminated) jobs with their exit codes and job IDs; a script or SAS code for this would be helpful. Can you also let us know the list of exit codes that LSF uses for grid jobs?
Thanks in Advance,
Siddhartha
Thanks for the reply Greg.
I ran the command you provided.
I just want to understand the reason for the following LSF exit codes, taken from the server: 1, 2, 3, 5, 231 and 255.
Job 929621 exited with exit code 1;
Job 916477 exited with exit code 2;
Job 930573 exited with exit code 5;
Job 916724 exited with exit code 255;
Job 926729 exited with exit code 3;
Job 922320 exited with exit code 231;
Thanks in Advance,
Siddhu
It is difficult to say what the code means when you do not know what command was being executed.
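As a rough first pass only, the codes can be sorted into two groups. The LSF documentation describes exit codes below 128 as values returned by the application itself, and codes above 128 as 128 plus a signal number. The sketch below applies that rule; the function name classify_exit_code is made up for illustration, and the result is a hint rather than a diagnosis. For example, 255 often just means the application returned -1, and 231 would map to 103, which is not a standard Linux signal, so each job still needs to be looked at individually.

```shell
# Classify an exit code using the "greater than 128 means 128 + signal" convention.
# classify_exit_code is a hypothetical helper name, not an LSF command.
classify_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "success"
  elif [ "$code" -gt 128 ]; then
    # May be a signal number; may also be an application value such as -1 (255).
    echo "possible signal $((code - 128))"
  else
    # Meaning depends entirely on the command the job was running.
    echo "application exit value $code"
  fi
}
```

For instance, classify_exit_code 137 reports a possible signal 9 (SIGKILL), while classify_exit_code 2 only says that the application itself returned 2.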
If a grid job runs the 'sasgrid' command and ends with a return code greater than 2, a log is stored in <config>/<LevX>/<AppServer>/GridServer/Logs that may provide more information.
You need to look at each job individually to see what command was executed, where it was executed (in case jobs on a certain host all fail), when it was executed, and how it ended (was it killed by the user?).
[sas@trcv003 ~]$ bhist -l $(bhist -t -T "$(date -d "2 weeks ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | egrep -o 'Job.[<>0-9]+.exited.with.exit.code.[0-9]+' | sed 's/[<>]//g' | cut -d' ' -f2) |sed ':a;N;$!ba;s/\n //g' | egrep -o '(^Job.<[0-9]+>.*User.<[A-Za-z0-9]+>|Exited.with.exit.code.[0-9]+)' | sed ':a;N;$!ba;s/\nExited/ exited/g'
Job <36178>, Job Name <SAS Enterprise Guide_SASApp - Workspace Server_0B496297-E9D2-3048-B8FC-2358884D8AD2>, User <sassrv> exited with exit code 1
Job <36296>, Job Name <test2>, User <sas> exited with exit code 102
Job <36297>, Job Name <test2>, User <sas> exited with exit code 102
Job <36299>, Job Name <test2>, User <sas> exited with exit code 102
Job <36300>, Job Name <test2>, User <sas> exited with exit code 102
Job <36301>, Job Name <test2>, User <sas> exited with exit code 102
[sas@trcv003 ~]$
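To spot patterns across many jobs (for example, one user or one exit code accounting for most failures), the lines above could be tallied per user and exit code. Again, this is only a sketch: the function name summarize_exits is made up, and it assumes each input line has the "User <name> exited with exit code N" shape shown in the output above.

```shell
# Count exited jobs per user and exit code.
# Input: lines like "Job <36296>, Job Name <test2>, User <sas> exited with exit code 102"
# Output: a CSV header followed by "user,exit_code,count" rows.
# summarize_exits is a hypothetical helper name, not an LSF command.
summarize_exits() {
  echo 'user,exit_code,count'
  sed -nE 's/.*User <([^>]*)> exited with exit code ([0-9]+).*/\1 \2/p' |
    LC_ALL=C sort |
    uniq -c |
    awk '{ print $2 "," $3 "," $1 }'
}
```

Piping the output of the long command above into summarize_exits would give one row per user and exit code combination, which is easier to scan than the full job list.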