Re: How to Handle Running Jobs When the SAS Server Needs to be Reboote...

dgower · Posted 06-09-2015 12:53 PM

We have our GRID nodes running on Windows (I know). So they need to be rebooted periodically. What is the best way to handle running jobs when we need to restart the servers?

ChrisHemedinger · Posted 06-09-2015 01:03 PM

Moved this to a different community, since I don't think it's specific to ITRM and more folks watching that group can answer.

Chris


jakarman · Posted 06-09-2015 03:49 PM

There is no need to reboot the Windows machines. You can have them running for a long period.

Problems are often within the "applications" like SAS. The cause can be memory problems or synchronisation issues. The real solution is a question for the developers, in this case SAS Technical Support (TS) and the developers of the SAS system.

If you need to work around issues in SAS, you could plan to restart all SAS servers (services is the better word), e.g. the metadata server. In that case your batch processes will not be affected.

If you need a planned outage of the OS, you can schedule it so that cancelling running jobs is an expected event.

---->-- ja karman --<-----

Kurt_Bremser · Posted 06-10-2015 03:05 AM

What is a "running job"? A scheduled batch job, or something initiated interactively by SAS VA or Enterprise Guide?


dgower · Posted 06-10-2015 08:27 AM

Both: we have scheduled batch jobs and users running jobs interactively. So I'm wondering if there's a way, or a best practice, to stop processing new jobs, say, 30 minutes prior to bouncing the servers, and a way to "gracefully" stop existing jobs immediately prior to restarting the servers. Thanks for your reply.

Kurt_Bremser · Posted 06-10-2015 10:02 AM

Batch jobs should always be written in a way that allows them to crash or be stopped unexpectedly, and be rerun without causing damage to data. E.g., new observations added to a table should "know" which run added them, so that a repeat of that run can filter them out before repeating the table update.
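To illustrate the run-tagging idea, here is a minimal sketch in Python with SQLite rather than SAS. The table `sales`, its columns, and the function `load_batch` are made up for the example. Every inserted row carries the ID of the run that added it, and a rerun first removes whatever an earlier attempt of the same run left behind.

```python
import sqlite3

def load_batch(conn, run_id, rows):
    """Apply one batch run; safe to call again with the same run_id."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sales (run_id TEXT, item TEXT, amount REAL)"
        )
        # Filter out rows that an earlier (possibly interrupted) attempt
        # of this same run already added.
        conn.execute("DELETE FROM sales WHERE run_id = ?", (run_id,))
        conn.executemany(
            "INSERT INTO sales (run_id, item, amount) VALUES (?, ?, ?)",
            [(run_id, item, amount) for item, amount in rows],
        )

conn = sqlite3.connect(":memory:")
rows = [("widget", 10.0), ("gadget", 5.5)]
load_batch(conn, "2015-06-10", rows)
load_batch(conn, "2015-06-10", rows)  # rerun after a crash or reboot
count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(count)  # 2, not 4: the rerun did not duplicate the rows
```

Because the delete and the inserts share one transaction, a job killed halfway through leaves the table as it was before the attempt.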

With interactive sessions you can't really know what timespan is right. Some jobs take seconds, some literally days.

In that context I'd like to see a tool that allows a SAS administrator to send messages to metadata-driven clients like EG.

Right now, one has to develop methods to do that outside of SAS or with the use of external commands (like running a ps on UNIX that finds the workspace servers, deduces the user IDs, finds the email addresses of those users, and sends them an email saying that the server will be going down).
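A rough sketch of the "ps, then user IDs, then email addresses" chain could look like the following. Several things here are assumptions and not part of any SAS tooling: the input is text in the shape produced by something like `ps -e -o user,pid,args`, the string `-objectserver` is only a placeholder for whatever identifies workspace server processes at your site, and `email_by_user` stands in for a real directory lookup. Sending the mail is left out.

```python
def users_to_notify(ps_output, marker, email_by_user):
    """Return sorted email addresses of users who own a process
    whose command line contains marker."""
    users = set()
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # USER, PID, rest of the command line
        if len(parts) == 3 and marker in parts[2]:
            users.add(parts[0])
    # Users without a known address are silently skipped in this sketch.
    return sorted(email_by_user[u] for u in users if u in email_by_user)

# Hypothetical sample data for illustration only.
sample = """USER PID COMMAND
alice 101 /opt/sas/sas -objectserver
bob 102 /usr/bin/vim notes.txt
alice 103 /opt/sas/sas -objectserver
carol 104 /opt/sas/sas -objectserver
"""
directory = {"alice": "alice@example.com", "carol": "carol@example.com"}
recipients = users_to_notify(sample, "-objectserver", directory)
print(recipients)
```

Each user appears once in the result even with several sessions open, so nobody gets the shutdown notice twice.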


jakarman · Posted 06-10-2015 01:02 PM

Kurt, EG with grid and parallel code submission is no longer an interactive-only approach. It is closer to batch work: that is what the flow processing in EG and SAS Studio offers, batch processing by self-service.

There is an advanced topic for dgower to think about: checkpoint restart in SAS. With that you should be able to cancel long-running jobs and pick them up again later. The topic is an advanced one with a lot of prerequisites. The only case where I have seen checkpoint restart used is mainframe job scheduling, with jobs that run for several days.
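The general idea behind checkpoint restart can be shown with a small Python sketch. This is not how SAS implements it; the checkpoint file format, the step names, and `run_with_checkpoints` are invented for illustration. The job records each completed step, and a restarted job skips the steps already recorded.

```python
import json
import os
import tempfile

def run_with_checkpoints(steps, checkpoint_path):
    """Run named steps in order, skipping those a previous attempt completed.
    Returns the names of the steps executed in this attempt."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    executed = []
    for name, func in steps:
        if name in done:
            continue  # finished before the interruption
        func()
        done.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)  # record progress after every step
        executed.append(name)
    return executed

log = []
steps = [
    ("extract", lambda: log.append("extract")),
    ("load", lambda: log.append("load")),
]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
first = run_with_checkpoints(steps[:1], path)  # simulate a job cancelled after step one
second = run_with_checkpoints(steps, path)     # restart: only "load" runs
print(first, second)
```

This only works if each step is itself safe to repeat, since a step interrupted midway is run again from its start.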

---->-- ja karman --<-----

How to Handle Running Jobs When the SAS Server Needs to be Rebooted?
