siddhartha77
Calcite | Level 5

Hi,

 

We are trying to identify the jobs that exited (terminated) and the reason each one exited. For analysis, we need a list of the exited jobs, their IDs, and the reason for each exit.

 

Please help us find these exited jobs using LSF, as a lot of jobs are being terminated automatically. Please also let us know how to proceed in getting these details from LSF.

 

Thanks in Advance,

Siddhu

8 REPLIES 8
gwootton
SAS Super FREQ

How about running (adjusting the range accordingly): bhist -t -T "$(date -d "1 week ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | egrep -o 'Job.[<>0-9]+.exited.with.exit.code.[0-9]+'

[sas@trcv003 ~]$ bhist -t -T "$(date -d "1 week ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | egrep -o 'Job.[<>0-9]+.exited.with.exit.code.[0-9]+'
Job <36296> exited with exit code 102
Job <36297> exited with exit code 102
Job <36299> exited with exit code 102
Job <36300> exited with exit code 102
Job <36301> exited with exit code 102
[sas@trcv003 ~]$
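If the list gets long, a per-code count can show which failures dominate. This is a minimal sketch that assumes the input lines look exactly like the output above; the function name count_exit_codes is made up for illustration.

```shell
# Count how many jobs ended with each exit code.
# Reads lines such as "Job <36296> exited with exit code 102" on stdin
# and prints "<exit code> <number of jobs>", sorted by exit code.
count_exit_codes() {
  awk '/exited with exit code/ { count[$NF]++ }
       END { for (code in count) print code, count[code] }' | sort -n
}
```

To use it, append `| count_exit_codes` to the end of the bhist pipeline shown above.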
--
Greg Wootton | Principal Systems Technical Support Engineer
siddhartha77
Calcite | Level 5

Thanks for your reply Greg.

A number of jobs are terminated daily (sometimes by the user, sometimes by the system), so we need the list of job IDs, along with the reason for each, for the business to analyse.

So we need to know which jobs exited (were terminated), with their exit codes and job IDs; a script or SAS code that does this would be helpful. Can you also let us know the list of LSF exit codes for grid jobs?

 

Thanks in Advance,

Siddhartha

gwootton
SAS Super FREQ
The log file for the job will typically indicate the failure.

The most common non-zero exit codes for SAS are:
1 - WARNING during execution
2 - ERROR during execution

The exit code of a SAS session can also be adjusted by the SAS code being executed.

100 level codes indicate a problem during startup of SAS, like pointing it to an inaccessible file or directory for WORK, LOG, LOGCONFIGLOC, PRINT, etc.

If LSF can't run the command being submitted to it (i.e. <SAS-Config>/Lev1/SASApp/WorkspaceServer.sh), I think that throws a 127 exit code.

Return Codes and Completion Status
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/hostwin/n0d8f2zgsbjtqwn1f384lia2nlbm.htm
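As a rough lookup, the codes mentioned above could be mapped to text like this. It is only a sketch: the function name describe_exit_code is hypothetical, the 127 entry reflects the guess above rather than confirmed LSF behavior, and because SAS code can set its own exit code, the job log remains the authority.

```shell
# Map an exit code to the rough meaning described in this post.
# 127 is listed before the 100-level pattern so that it matches first.
describe_exit_code() {
  case "$1" in
    0)           echo "success" ;;
    1)           echo "SAS warning during execution" ;;
    2)           echo "SAS error during execution" ;;
    127)         echo "command could not be run" ;;
    1[0-9][0-9]) echo "problem during SAS startup" ;;
    *)           echo "unknown; check the job log" ;;
  esac
}
```

For example, `describe_exit_code 102` reports a startup problem, which would point at the WORK, LOG, LOGCONFIGLOC or PRINT locations.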
--
Greg Wootton | Principal Systems Technical Support Engineer
siddhu1
Quartz | Level 8

Thanks for the reply Greg.

I ran the command you provided.

I would like to understand the reason for the following LSF exit codes, taken from the server: 1, 2, 3, 5, 231 and 255.

 

Job 929621 exited with exit code 1;
Job 916477 exited with exit code 2;
Job 930573 exited with exit code 5;
Job 916724 exited with exit code 255;
Job 926729 exited with exit code 3;
Job 922320 exited with exit code 231;

 

Thanks in Advance,

Siddhu

doug_sas
SAS Employee

It is difficult to say what the code means when you do not know what command was being executed.

  • As Greg said, 1 & 2 could be SAS return codes for 'warnings in SAS execution' and 'errors in SAS execution' respectively.
  • 255 could be that the job was killed for some reason. Maybe the user terminated the job while it was running.
  • 231 is usually what you see when the job returned a 999 exit code. Return codes have to fit in one byte, so only the low byte of 999 (0x3E7) survives: 0xE7 = 231.
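The one-byte truncation in the last bullet can be checked with shell arithmetic. The helper name low_byte is made up for this illustration.

```shell
# Keep only the low byte of an exit code, which is all that fits
# in a one-byte return code.
low_byte() {
  echo $(( $1 & 0xFF ))
}

low_byte 999   # prints 231
```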

If a grid job executes the command 'sasgrid' and returns a return code greater than 2, a log will be stored in <config>/<LevX>/<AppServer>/GridServer/Logs that may provide more information.

 

You need to look at each job individually to see what command was executed, where it was executed (in case jobs on a certain host all fail), when it was executed, and how it ended (was it killed by the user?).

siddhu1
Quartz | Level 8
Thanks for the reply Greg.
Can I get the username as well, along with the job ID and exit code?

Kind Regards,
Siddhartha
gwootton
SAS Super FREQ
The command I provided is manipulating the output of the bhist command. This output also contains the user ID so you could adjust this command to grab whatever you like from it.
The bhist command reads from a history file, so you could also parse that file directly.
--
Greg Wootton | Principal Systems Technical Support Engineer
gwootton
SAS Super FREQ
[sas@trcv003 ~]$ bhist -l $(bhist -t -T "$(date -d "2 weeks ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | egrep -o 'Job.[<>0-9]+.exited.with.exit.code.[0-9]+' | sed 's/[<>]//g' | cut -d' ' -f2) |sed ':a;N;$!ba;s/\n                     //g' | egrep -o '(^Job.<[0-9]+>.*User.<[A-Za-z0-9]+>|Exited.with.exit.code.[0-9]+)' | sed ':a;N;$!ba;s/\nExited/ exited/g'
Job <36178>, Job Name <SAS Enterprise Guide_SASApp - Workspace Server_0B496297-E9D2-3048-B8FC-2358884D8AD2>, User <sassrv> exited with exit code 1
Job <36296>, Job Name <test2>, User <sas> exited with exit code 102
Job <36297>, Job Name <test2>, User <sas> exited with exit code 102
Job <36299>, Job Name <test2>, User <sas> exited with exit code 102
Job <36300>, Job Name <test2>, User <sas> exited with exit code 102
Job <36301>, Job Name <test2>, User <sas> exited with exit code 102
[sas@trcv003 ~]$
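To hand this to the business for analysis, the lines above could be turned into CSV. This is a sketch that assumes the input has exactly the "Job <id>, Job Name <name>, User <user> exited with exit code N" shape shown above; the function name jobs_to_csv is hypothetical, and the job name is left out because it may itself contain commas.

```shell
# Convert "Job <id>, Job Name <name>, User <user> exited with exit code N"
# lines on stdin into CSV rows of job_id,user,exit_code.
# Lines that do not match the expected shape are skipped.
jobs_to_csv() {
  echo "job_id,user,exit_code"
  sed -n -E 's/^Job <([0-9]+)>.*User <([^>]+)> exited with exit code ([0-9]+).*$/\1,\2,\3/p'
}
```

Appending `| jobs_to_csv > exited_jobs.csv` to the pipeline above would write a file that can be opened in a spreadsheet or read into SAS.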
--
Greg Wootton | Principal Systems Technical Support Engineer
