Hello Experts,
I am reaching out for your help and insights on cloud-based SAS Grid.
For a SAS 9.4 Grid environment, suppose there are the following requirements:
- Scheduling and automating the scaling of SAS Grid nodes.
- Implications of doing so on a daily basis.
- Key points to ensure on both the infrastructure and SAS Grid application fronts.
- Any known issues, or precautions you have taken in your environment.
Regards,
Ankit.
There are a couple of ways of doing this in SAS Workload Orchestrator, both using time-based configuration. You can change the queue configuration based on time, so that the list of hosts the queue can use changes for a certain period. Or you can change the host type configuration based on time, setting the maximum number of jobs on a host type to 0 during a certain period. For the second approach, create one host type for the hosts you want to keep open and one host type for the hosts you want to close. Note that SAS Workload Orchestrator has no facility to kill jobs that are still running outside of a time window.
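To make the two-host-type idea concrete, here is a minimal Python sketch of the schedule logic only. It is not SAS Workload Orchestrator configuration syntax; the host type names (`always_on`, `overnight_only`), the 20:00 to 08:00 window, and the job limits are all assumptions chosen for illustration.

```python
from datetime import time

# Hypothetical overnight batch window; adjust to your own schedule.
OVERNIGHT_START = time(20, 0)
OVERNIGHT_END = time(8, 0)

# Hypothetical host types and their normal per-host job limits.
NORMAL_MAX_JOBS = {"always_on": 8, "overnight_only": 8}


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` is inside [start, end), allowing wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def max_jobs(host_type: str, now: time) -> int:
    """Job limit a host type should have at `now` under the two-type scheme."""
    if host_type == "overnight_only" and not in_window(now, OVERNIGHT_START, OVERNIGHT_END):
        # Closed to new jobs. Jobs already running are NOT killed by this.
        return 0
    return NORMAL_MAX_JOBS[host_type]


if __name__ == "__main__":
    for t in (time(2, 0), time(8, 0), time(14, 0), time(22, 30)):
        print(t, {ht: max_jobs(ht, t) for ht in NORMAL_MAX_JOBS})
```

Because a limit of 0 only blocks new jobs, any infrastructure-level stop should still check that the host is idle first.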
If you are using LSF, you could change the queue configuration to remove hosts for the period of time you want. LSF also has DISPATCH_WINDOW and RUN_WINDOW settings on queues, and DISPATCH_WINDOW on hosts, which regulate when jobs can be scheduled (DISPATCH_WINDOW) or run (RUN_WINDOW).
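For the "close at the application level, then stop the infrastructure" ordering on LSF, a rough Python sketch might look like the following. It is only an illustration under assumptions: it uses the LSF commands `badmin hclose`, `badmin hopen` and `bjobs`, assumes the default `bjobs` table layout with a header line starting with `JOBID`, and the function names, polling interval and retry count are made up. Stopping and starting the instance itself is left to your infrastructure tool.

```python
import subprocess
import time
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout (stderr is ignored)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def count_running(bjobs_stdout: str) -> int:
    """Count job rows in `bjobs` output, skipping blank lines and the header.

    Assumes the default table layout whose header line starts with JOBID.
    """
    rows = [ln for ln in bjobs_stdout.splitlines() if ln.strip()]
    return sum(1 for ln in rows if not ln.lstrip().startswith("JOBID"))


def drain_host(host: str, run: Runner = run_cli,
               poll_seconds: float = 60.0, max_polls: int = 120) -> bool:
    """Close `host` to new jobs, then wait until no jobs are running on it.

    Returns True once the host is idle, or False if it is still busy after
    max_polls checks (in which case the instance should NOT be stopped).
    Only running jobs (-r) are counted; suspended jobs are not.
    """
    run(["badmin", "hclose", host])
    for _ in range(max_polls):
        if count_running(run(["bjobs", "-u", "all", "-r", "-m", host])) == 0:
            return True
        time.sleep(poll_seconds)
    return False


def reopen_host(host: str, run: Runner = run_cli) -> None:
    """Reopen `host` for scheduling once the instance is back up."""
    run(["badmin", "hopen", host])
```

A scheduler job could call `drain_host("node5")` (hypothetical host name) after the overnight batch, stop the instance only if it returns True, and call `reopen_host("node5")` after the instance has been started again in the evening.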
Hello @ankitd ,
there are many points to consider, but there are some that seem the most interesting to me:
Does that help? Or would you like to go deeper into specific questions?
@JuanS_OCS @doug_sas thanks for your insights. I understand that the setup and co-location, along with network, storage etc., are key drivers.
However, what I am trying to understand is this: once we have set up the environment and migrated the artefacts to the cloud, with the necessary validation and testing completed successfully, how can I automate the scaling of Grid nodes if I wish to do that on a daily basis?
Example: the Grid environment has 8 nodes. I would like to have all 8 available for the overnight batch processing, but scale down to 4 nodes after the overnight batch run (i.e. during business hours).
Since this would need to be done daily, there would need to be automation in place at the infrastructure, storage and Grid application levels.
Given the above scenario, I am looking for the insights requested earlier:
- Scheduling and automating the scaling of SAS Grid nodes.
- Implications of doing so on a daily basis.
- Key points to ensure on both the infrastructure and SAS Grid application fronts.
- Any known issues, or precautions you have taken in your environment.
Regards,
Ankit.
@MargaretC , by automation I mean what happens after the environment has been set up.
Specifically, automating the closure of Grid nodes during a specific time window so that they can then be moved to a STOPPED state via the infrastructure tool.
I am aware that the scheduled stop and start of nodes at the infrastructure level needs to be handled separately.
However, any insights on what precautions need to be taken at both the infrastructure and storage levels would be helpful.
Coming back to the main ask: what I am looking for is how to close Grid nodes at the application level before stopping the infrastructure, and how to start the application back up once the nodes are available again.
Any real-world implementation examples and insights would help.
Regards,
Ankit
@MargaretC yes, I'm looking for those insights. It would be great if you could help, covering the points I mentioned in my initial post.
Thanks in advance.
In addition to what @JuanS_OCS said, LSF includes 'resource connectors' that allow LSF to start up instances on various cloud providers. SAS currently supports AWS instances, but the latest LSF has support for others.
You also need a data storage strategy. For example, will it be
Each has good & bad attributes (cost, speed, access). Also, if all instances went away, where would the data be stored for when the instances come back up?
Just some things to think about.