yanamadala85
Obsidian | Level 7
 

I am using the RSUBMIT method to distribute jobs in parallel. I have a scenario for which I cannot find a solution.

1) I submitted the RSUBMIT blocks and distributed the jobs to different grid nodes.

2) For some reason, in the middle of program execution, one of the grid nodes becomes unavailable or fails.

 

Is there any way to find the unavailable, unresponsive, or idle grid node?

8 REPLIES
JuanS_OCS
Amethyst | Level 16

Hello @yanamadala85,

 

Yours is a very good question. I think the answer lies in checking the status of the Connect Spawner and the Object Spawner on every node of your grid.

 

UI-wise, RTM or Grid Monitor will help you. 

 

Now, command-line-wise, you could go for one of these options:

- check the logs of the spawners

- check the status of the services with ego commands

- Run Gridtest_fast.sas or Gridtest.sas before executing your program (https://support.sas.com/rnd/scalability/grid/download.html). However, I think this might be overkill if you plan to run it before every program.

 

Nevertheless, I think that besides the checks, you would rather configure High Availability (HA) in your grid environment, precisely to ensure that, if any grid node or service goes down, the service will start on another machine for the time being. That way, your programs should not fail.

 

For all of that and more, you can check:

https://support.sas.com/rnd/scalability/grid/HA/gridha.html

 

Please note that most of these approaches are for the administrators of your environment to execute. I mean, a programmer/user should not really need to be concerned about the availability of the environment. If technical problems arise, you open a support ticket and another team takes care of it on your behalf. 🙂

 

yanamadala85
Obsidian | Level 7

Hi  

We are not using the EG UI for submitting the SAS jobs. In this scenario I cannot even configure HA, and I can't check the logs of the spawners.

I have another question: is there any way to find the allocated memory and the available free memory of a server (grid node) programmatically or by commands?

JuanS_OCS
Amethyst | Level 16

Hi @yanamadala85,

 

A couple of comments:

 

  1. As mentioned, all of that is for your administrators
  2. Are you capable / authorised to run X commands?
yanamadala85
Obsidian | Level 7

I am not authorized to run the commands. I asked in order to know whether anything is available or not.

 

Mainly, I am looking for a programmatic way to find the inactive grid server and the available free memory of the server.

JuanS_OCS
Amethyst | Level 16

I don't think you can, without running X commands or running commands directly on your server's shell interface.

 

Running X commands is a high-level risk, which is why it is normally disabled. It is normally enabled only for a few high-level users or administrators.

 

All of the above makes my advice stand: please align with your SAS or system administrators. Open communication with them and ensure they understand your challenge, which is, in the end, probably also a challenge for your company's business.

yanamadala85
Obsidian | Level 7

Thank you, I felt the same.

 

I have a general question: is there any way to submit the jobs only to a particular grid node? We have four grid nodes, and I want to submit the RSUBMIT blocks to only one of them.

yanamadala85
Obsidian | Level 7
I have a question regarding looping of RSUBMIT blocks.

I want to submit 30 programs in parallel.

Which would be the best way to do the RSUBMITs?
A. Writing 30 individual RSUBMITs
B. Looping RSUBMITs

%macro loop(n);
  /* Preprocessing: enable SAS sessions on the grid, set AUTOSIGNON, etc. */
  %do i = 1 %to &n;
    rsubmit task&i wait=no;
      /* Some SAS program */
    endrsubmit;
  %end;
  waitfor _all_;
  /* Some SAS program */
  signoff _all_;
%mend loop;
%loop(30);

Or

rsubmit task1 wait=no;
  /* Some program */
endrsubmit;
....

rsubmit task30 wait=no;
  /* Some program */
endrsubmit;
Patrick
Opal | Level 21

@yanamadala85

"I have general question, Any way to submit the jobs only to particular grid node."

Setting up workload balancing is a SAS Admin task. I don't know why you would want to use only a single node, but yes, that can be done by defining a queue which only hits a single node. Discuss your requirement with the SAS Admin at your site.

 

For executing 30 jobs in parallel: first of all, the LSF queue you're using must allow 30 jobs in parallel; otherwise, whatever you set up won't run with as much parallelism as you expect. Secondly, if possible, use LSF to define your flows and define parallelism and job dependencies there.
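
If the jobs stay in SAS/CONNECT rather than moving to LSF flows, the looping variant from the question could look roughly like the sketch below. It is only an illustration under assumptions: the application server name SASApp, the macro name run_parallel, and the status1..statusN macro variables are hypothetical and not taken from this thread. It uses the CMACVAR= option of RSUBMIT so that each task reports its state back to the client session. Whether a grid node that fails mid-run shows up as a non-zero value depends on the environment, so treat that part as something to verify with your administrators.

```sas
/* Hypothetical sketch: SASApp, run_parallel and status&i are assumed names. */
%macro run_parallel(n=30);
  %local i tasks rc;

  /* Grid-enable all SAS/CONNECT sessions; SASApp is an assumed server name. */
  %let rc = %sysfunc(grdsvc_enable(_all_, resource=SASApp));
  options autosignon;

  %do i = 1 %to &n;
    %global status&i;
    /* CMACVAR= names a client-side macro variable holding this task's state. */
    rsubmit task&i wait=no cmacvar=status&i;
      /* program for this task goes here */
    endrsubmit;
    %let tasks = &tasks task&i;
  %end;

  /* Block until every submitted task has finished. */
  waitfor _all_ &tasks;

  %do i = 1 %to &n;
    /* 0 = completed, 1 = failed to execute, 2 = still in progress */
    %if &&status&i ne 0 %then
      %put WARNING: task&i did not complete (CMACVAR=&&status&i).;
  %end;

  signoff _all_;
%mend run_parallel;

%run_parallel(n=30)
```

A loop like this keeps the 30 blocks in one place and makes the per-task status check uniform; writing the blocks out individually behaves the same way but is harder to maintain.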
