Good to see SAS now has its own fully-fledged workload and scheduling tool for grid computing.
What I would like to know is whether this tool is also available in non-grid installations of SAS 9.4M6, where its scheduling capabilities appear to be a complete replacement for LSF. In my view this would be a major improvement, as it would remove the requirement to install Platform Process Manager separately.
You can buy SAS Job Flow Scheduler to do single machine scheduling just like you could have bought Platform Suite for SAS to do the same in the past.
And it's not included in any of the Data Integration/Data Management Server offerings? (It has to be ordered separately?)
At my current site (M4), we have the SAS Scheduling Server installed. My guess is that it came (in our case) with the Risk & Financial Workbench.
And I would be careful calling it "fully fledged" until I got to see the full feature list explained.
@LinusH - that thought had occurred to me too. What would be good is a feature list comparison between the two products.
I can give a feature comparison for SAS Workload Orchestrator vs LSF vs YARN (from my upcoming SGF 2019 presentation):
| Feature | SAS Workload Orchestrator | Platform LSF | Hadoop YARN |
|---|---|---|---|
| Preemptable priority queue-based scheduling | Yes | Yes | Yes |
| Fairshare based scheduling | No | Yes | Yes |
| Host selection by sorting on dynamic resources values | Yes | Yes | No |
| Host selection by static accounting | Yes | No | Yes |
| Job preemption by suspension | Yes | Yes | No |
| Pluggable scheduling logic | No | No | Yes |
| Time-based queue configuration | Yes | Yes | No |
| Time-based host configuration | Yes | Yes | No |
| Time-based hostgroup configuration | No | Yes | No |
| Time-based usergroup configuration | No | Yes | No |
| Time-based configuration time definition | Cron Expression | Day of week, time of day | N/A |
| Scheduling based on dynamic resources | Yes | Yes | No |
| User-defined resources | Yes | Yes | No |
| Scheduling thresholds based on resources | Yes | Yes | No |
| Suspension thresholds based on resources | Yes | Yes | No |
| Scheduling dispatch windows | Yes | Yes | No |
| Queue active job limits | Yes | Yes | No |
| Terminate jobs when limit exceeded | Yes | Yes | No |
| Job request specifies minimum resource requirements | Yes | Yes | No |
| Job request specifies consumable resource requirements | Yes | Yes | Yes |
| Job request specifies Boolean resources | Yes | Yes | Yes |
| Job owner | Authenticated user on job submission | Process owner of job submission | Authenticated user on job submission |
| Authentication | Username/password, Kerberos | Process owner | Kerberos |
| Data-in-motion security | SSL | Proprietary | SSL |
| Data-at-Rest security | File-based permissions. Sensitive data encrypted using AES128 with key derived from site-defined password (SHA256, 10000 iterations) | File-based permissions. Sensitive data encrypted with internally defined key, AES128 | File-based permissions. |
| End-to-end Kerberos to run jobs | Yes | No | No |
| Type of jobs that can be run | Any | SAS only | Any |
| Embedded GUI | Yes | No | Yes |
| Dynamic configuration | Yes | No | No |
| REST-API Based | Yes | No | Yes |
| Configuration files | One | Multiple | Multiple |
| Ability to start/stop VMs as needed | No | Yes | No |
| Support parallel jobs | No | Yes | Yes |
| Ability to change job information before it gets into the queue | Yes | Yes | No |
| Supported Operating Systems | Windows X64, Linux X64 | All SAS v9 Windows and UNIX server platforms | Linux X64 |
Hi @doug_sas , many thanks for sharing, very useful.
While there might be some items that are not fully clear to me, what especially caught my attention is that "Support parallel jobs" is not available. Would this not be a problem for SAS Grid Manager deployments? How is this currently worked around?
SAS Grid Manager creates single jobs. Parallel SAS programming requires multiple grid-enabled SAS/CONNECT SIGNONs, each of which is a single job.
Parallel jobs would be to support a job that uses something like MPI across multiple machines. For that we have Viya's Cloud Analytic Services.
Hi @doug_sas , thank you.
Makes sense, but I don't feel that the question has been answered. Please bear with me and let me rephrase it:
How are current SAS Grid Manager users (SAS Grid Manager for Platform on M6, or SAS Grid Manager prior to M6) going to manage the lack of the "Support parallel jobs" capability, which was present before? Among all the improvements, this seems to me a significant point of attention.
I mean, and please correct me if I am wrong, it has basically been the main feature of SAS Grid Manager so far, but now it won't be there. The current SAS Grid Manager environment will require SAS Viya in order to use this functionality. Is this understanding correct?
The 'support parallel jobs' feature listed in the table means the ability to submit to the workload manager a single job request that results in multiple processes being started on multiple machines. This would be used, for example, to start an MPI job on the grid that spans multiple machines.
SAS Grid Manager has never used parallel jobs or something like MPI to do its work. Parallelism in SAS code is done via SAS/CONNECT. SAS code would start multiple CONNECT servers using grid-enabled SIGNON statements, which results in a single grid job for each server process. The SAS code would then submit SAS code to each server using the RSUBMIT/ENDRSUBMIT statements. Assuming the RSUBMITs set the option to process the remote code asynchronously, the code on all servers executes in parallel. You still get the parallelism, but it is not done via a single job request - it is done through multiple job requests.
So 'parallel jobs' are not needed to be able to run parallel SAS code. Actually SAS/CONNECT allows you to run parallel SAS code without needing SAS Grid Manager.
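For readers who have not used this pattern, here is a minimal sketch of the SIGNON/RSUBMIT approach described above. It is an illustration only: the logical grid server name `SASApp`, the session names `task1` and `task2`, and the procedures run in each session are assumptions, not taken from any particular deployment.

```sas
/* Assumption: SASApp is the logical grid server name at your site. */
/* Grid-enable all subsequent SAS/CONNECT sessions.                 */
%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));

/* Each SIGNON starts one CONNECT server, i.e. one single grid job. */
signon task1;
signon task2;

/* WAIT=NO makes each RSUBMIT asynchronous, so control returns      */
/* immediately and the two remote blocks run at the same time.      */
rsubmit task1 wait=no;
  proc means data=sashelp.class;
  run;
endrsubmit;

rsubmit task2 wait=no;
  proc freq data=sashelp.cars;
    tables origin;
  run;
endrsubmit;

/* Block until both remote sessions have finished, then clean up.   */
waitfor _all_ task1 task2;
signoff _all_;
```

The workload manager sees two ordinary single-host jobs here, not one job spanning several machines, which is why the "Support parallel jobs" row in the table does not affect this style of parallel SAS code.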
@doug_sas - Great feature summary.
What is of most interest is how SGM integrates with SAS versus how LSF does it. For example it is annoying when in SAS Management Console you have to log in a second time to do any scheduling. Also in SMC you can set up and/or run a scheduled job but you have to switch to Platform Process Manager to get the status of these jobs (running or not, successful or not). Having all scheduling functionality in one interface would be a big improvement.
@SASKiwi wrote: Good to see SAS now has its own fully-fledged workload and scheduling tool for grid computing.
https://blogs.sas.com/content/sgf/2019/01/22/native-scheduler-new-types-of-workloads-and-more-introducin...
What I would like to know is whether this tool is also available in non-grid installations of SAS 9.4M6, where its scheduling capabilities appear to be a complete replacement for LSF. In my view this would be a major improvement, as it would remove the requirement to install Platform Process Manager separately.
Yeah I was searching for this information as well. I am not a professional but I've had a feeling that getting rid of PPM may increase performance.
You can buy SAS Job Flow Scheduler as a single machine scheduler.