Scheduled Job (Platform Process Manager) did not start

We have numerous flows (150-200) that are triggered by all kinds of events. Recently I saw some odd behavior: certain flows that are scheduled to run on the standard Daily@Sys calendar (at varying times of day) simply did not start. There is no record in Flow Manager that they even tried to run (i.e., no history). They had been running fine for a long time, did not run on a given day, and then ran fine again the next day.

Does anyone know how to troubleshoot or investigate this? I'm guessing there are relevant logs somewhere in the PM install directory, but I'm not as familiar with this product as I am with SAS products.

My first hunch is that there's a load or server limit somewhere. I kind of doubt it, though, since the latest instance involved a job scheduled for 12:05 am, and we do not have many jobs scheduled for that time.

In case it's relevant, we are on SAS 9.4M4, Platform Suite 9.1.3, Linux x64 servers in a grid environment.

Thanks!


Accepted Solutions
Solution
2 weeks ago
Regular Contributor
Posts: 175

Re: Scheduled Job (Platform Process Manager) did not start

Posted in reply to Timmy2383

SAS Tech Support recommended I take various steps to clear out the JFD/Process Manager history and cache. I followed their recommendation and so far have had no issues. Here's what they told me:

1. Back up, then delete, the lsb.events.* files. Keep only lsb.events and lsb.events.1.
2. Back up and delete the following two files:
                (a) $JS_HOME/work/system/jobidmap.dat.1
                (b) $JS_HOME/work/system/lsf.events
3. Back up and delete all files in $JS_HOME/work/history except those created in the last 3-5 days.
4. Back up and delete all files in $JS_HOME/work/events except those created in the last 3-5 days, and keep js.events.
5. Back up and delete all files in $JS_HOME/work/variable except those created in the last 3-5 days.
6. Back up and delete all files in $JS_HOME/work/storage/error except those created in the last 3-5 days.
7. Back up and delete all files in $JS_HOME/work/storage/flow_instance_storage/finished, leaving the last 5 days.
8. Back up and delete $LSF_HOME/work/<cluster>/logdir/lsb.events.* except those created in the last 3-5 days (example given by Tech Support: $LSF_HOME D=\LSF_51\work\cluster1\logdir).
9. Back up and delete all files in $JS_HOME/work/storage/cache/.
10. Back up and delete the log files located in $JS_HOME/log.
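
In case it helps anyone, the "back up and delete everything older than N days" steps (3 through 7) could be scripted along these lines. This is only a rough sketch, not something Tech Support supplied: the function name archive_old_files, its parameters, and the backup location are my own hypothetical choices, and it uses file modification time as a stand-in for "created", since creation time is not reliably available on Linux. It defaults to a dry run so you can review the list before anything is moved.

```python
import shutil
import time
from pathlib import Path


def archive_old_files(src_dir, backup_dir, keep_days=5, keep_names=(), dry_run=True):
    """Move files older than keep_days from src_dir into backup_dir.

    Returns the names of the files that were (or, in a dry run, would be) moved.
    """
    src = Path(src_dir)
    dest = Path(backup_dir)
    cutoff = time.time() - keep_days * 86400
    moved = []
    for path in sorted(src.iterdir()):
        # Only plain files are considered; subdirectories are left alone.
        if not path.is_file() or path.name in keep_names:
            continue
        # Assumption: modification time approximates "created in the last N days".
        if path.stat().st_mtime >= cutoff:
            continue
        moved.append(path.name)
        if not dry_run:
            dest.mkdir(parents=True, exist_ok=True)
            # Moving the file is the backup and the delete in one step.
            shutil.move(str(path), str(dest / path.name))
    return moved
```

For step 4, for example, the call would look something like archive_old_files(js_home + "/work/events", "/some/backup/dir", keep_names=("js.events",)), first with the default dry_run=True and then with dry_run=False once the list looks right. I would only run it during a maintenance window.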



All Replies
Trusted Advisor
Posts: 1,424

Re: Scheduled Job (Platform Process Manager) did not start

Posted in reply to Timmy2383

Hello @Timmy2383,

If you are on a grid, there may be limits on job execution. Jobs can be held in a queue, but they will still be triggered and will execute at some point. There is a difference between the Process Manager (JS) and the LSF resource manager. The Process Manager won't act much differently from any other scheduler, such as cron or at. The only difference is that you can customise calendars.

First, I would check the jfd file (you can use locate to find it, but it will be under your JS_Top/work directory).

Second, I wonder if you could check what else happened on the server at the time the job did not run. For example, the jfd service may have been stopped, so the job was never triggered. Or there may have been maintenance on the server (security patches?).
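
One rough way to do that check is to search the logs for the time in question. Below is a minimal sketch, assuming the logs are plain text and that each line carries a timestamp; open one log first to see how the timestamps are actually written, because I am not assuming any particular format here. The function name find_log_lines and the example search string are purely illustrative and not part of Process Manager.

```python
from pathlib import Path


def find_log_lines(log_dir, needle, pattern="*"):
    """Yield (file name, line number, line) for each line under log_dir
    that contains the text in needle."""
    for path in sorted(Path(log_dir).rglob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a file is not clean text.
        with path.open("r", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path.name, number, line.rstrip("\n")
```

You would point log_dir at the Process Manager log directory and pass whatever timestamp prefix matches the missed start, for example the date plus "00:0" for a job due at 12:05 am, then read what the daemon was doing in those minutes.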


Regular Contributor
Posts: 175

Re: Scheduled Job (Platform Process Manager) did not start

Posted in reply to JuanS_OCS

Thanks, Juan.

There definitely wasn't any maintenance going on. It's possible the JFD stopped for some reason and was restarted by EGO, but I'm not sure how I could determine that.

So far SAS TS has recommended clearing out many of the PM logs and cache. I will have to do this during the next scheduled maintenance window and then monitor after that.

Trusted Advisor
Posts: 1,424

Re: Scheduled Job (Platform Process Manager) did not start

Posted in reply to Timmy2383

Same old nice trick :) Thanks for sharing, @Timmy2383!

☑ This topic is solved.
