🔒 This topic is solved and locked.
bdoug
Obsidian | Level 7

Something is up with our SAS Grid (9.4M3).  It is a new installation and even a simple "bsub sleep 20" request remains in PEND status.

 

bjobs -l 768

 

PENDING REASONS:
New job is waiting for scheduling;

 

SCHEDULING PARAMETERS:
           r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem
loadSched   -     -    -    -   -   -   -   -    -    -    -
loadStop    -     -    -    -   -   -   -   -    -    -    -

RESOURCE REQUIREMENT DETAILS:
Combined: -
Effective: -

 

Is there an LSF command-line command that gives detailed information on why a job is not being dispatched, such as the criteria LSF is using to keep the job in pending status?


SASKiwi
PROC Star

This may be a long shot, but this problem sounds very similar to one we struck. There is a weird bug in LSF, at least up to V9.1, which happens when you configure it to send emails and an email error occurs. LSF creates a file called PROGRAM, stores it in the same root folder where LSF is installed (e.g., on Windows, C:\Program Files), and writes the email SMTP error in it.

 

This rogue file called PROGRAM then blocks any use of LSF thereafter as it is executed instead of the real LSF program! Simply renaming the file will fix the problem. 

bdoug
Obsidian | Level 7

That is interesting, but not our issue.  It turns out we had an NTP issue on our servers, which confused LSF.

 

I still wonder if there is a command to get LSF to tell you why it is not dispatching a job.
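
Since the root cause here turned out to be clock drift between servers, a quick comparison of host clocks can be sketched as below. This is only a rough sketch: `clock_skew` and `skew_ok` are hypothetical helper names, and the 5-second default tolerance is an arbitrary assumption, not a documented LSF threshold.

```shell
# Hypothetical helper: absolute difference, in seconds, between two
# epoch timestamps (for example, one per grid host).
clock_skew() {
  skew_diff=$(( $1 - $2 ))
  if [ "$skew_diff" -lt 0 ]; then
    skew_diff=$(( 0 - $skew_diff ))
  fi
  echo "$skew_diff"
}

# Hypothetical helper: succeeds when the skew is within a tolerance.
# The default of 5 seconds is an arbitrary assumption.
skew_ok() {
  [ "$(clock_skew "$1" "$2")" -le "${3:-5}" ]
}
```

One possible use is `skew_ok "$(date +%s)" "$(ssh <host> date +%s)" || echo "clock skew"`, where `<host>` stands for another node in the grid. The ssh round trip adds some latency of its own, so small differences are not meaningful.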

Kurt_Bremser
Super User

@bdoug wrote:

That is interesting, but not our issue.  It turns out we had an NTP issue on our servers, which confused LSF.

 

I still wonder if there is a command to get LSF to tell you why it is not dispatching a job.


How should a dispatcher do this when the server's time is not correctly set? If it knew it should run a job, it would run the job. If the time is not right for the job, then the job shall not be run anyway, and no message needs to be sent.

If you had the dispatcher send you a message for every job it does not run at the moment for any reason, you'd be drowned in messages in seconds.

bdoug
Obsidian | Level 7

Maybe instead of sending a message, supply the reason for pending in extended bjobs information?  bjobs -l <ID>

 

That way, I can see why a job is pending, but only when I ask.
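
For what it is worth, the PENDING REASONS block that `bjobs -l` already prints can be pulled out on demand with a small filter. The sketch below assumes the output layout shown in the original question (a "PENDING REASONS:" heading followed by the reasons, then another all-caps heading); `pending_reasons` is a hypothetical helper name, not an LSF command.

```shell
# Hypothetical helper: print only the pending reasons from "bjobs -l" output
# read on standard input. Assumes the layout shown earlier in this thread.
pending_reasons() {
  awk '
    # Start capturing after the PENDING REASONS heading.
    $1 == "PENDING" && $2 == "REASONS:" { in_block = 1; next }
    # Stop at the next all-caps heading, e.g. SCHEDULING PARAMETERS:
    in_block && /^ *[A-Z][A-Z ]*: *$/ { in_block = 0 }
    # Print non-empty lines inside the block, without leading whitespace.
    in_block && NF > 0 { sub(/^[ \t]+/, ""); print }
  '
}
```

Usage would be along the lines of `bjobs -l 768 | pending_reasons`. Separately, `bjobs -p` lists pending jobs together with their pending reasons, though with a clock problem like the one in this thread it will likely show the same generic "New job is waiting for scheduling" text.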

JuanS_OCS
Amethyst | Level 16

Hello @bdoug,

 

I wonder if this information would help you out: http://www-01.ibm.com/support/docview.wss?uid=isg3T1016430

 

Besides this, here are some items from my personal experience:

 

- I would check the IBM site and ask SAS Technical Support whether there is a recommended patch available for your LSF version... normally there is something.

 

- Also, on Linux installations, I have seen this several times when some part of the installation or configuration was not done properly (normally in the LSF config or the prerequisites).

 

- Please also check your queue configuration... it might be that the job is not going to the queue you expect, or is just hanging there forever.

 

So I would propose this: while you investigate on your side or with us, contact SAS Technical Support as well. They normally have useful insights.

 

Best regards,

Juan

 

 
