
How the Cluster Manager makes SAS Event Stream Processing massively parallel and elastic

Started ‎12-05-2016 by
Modified ‎08-09-2018 by

Editor's Note: Adapter Manager is now known as Cluster Manager. The text below has been changed, but the video still references Adapter Manager.

 

===================================================================================================

 

SAS Event Stream Processing (ESP) version 4.2 brings many new features. Among them is the Cluster Manager (formerly known as Adapter Manager). The ESP Cluster Manager processes streaming events on a grid of ESP engine instances, which makes ESP massively parallel. The Cluster Manager also provides elasticity, which is particularly useful when deploying ESP in the cloud: admins can scale ESP dynamically across multiple hosts, adding and removing ESP engines as demand increases and decreases.


The purpose of ESP Cluster Manager

Watch the following video to understand the goal of the Cluster Manager.

 

 

 

Configuration of a Cluster Manager

An ESP Cluster Manager is defined in an XML file, which must specify the following:

  • The ESP project that you want to deploy on the grid of ESP engine instances
  • The data sources (publishers) that you want to receive events from
    • They are equivalent to “connectors” but are called “raw-sources” in the Cluster Manager context
    • “Raw-sources” share the same properties as “connectors”
    • No publisher connector should be defined in the ESP project itself. The Cluster Manager fulfills the role of connecting to the data sources and publishing events to the ESP project source windows
    • Currently, only “connectors” (that is, “raw-sources”) can be managed by the Cluster Manager. “Adapters” are not supported, but they are being considered for the next release. (Editor's note: See 
  • The route between the data sources (raw-sources) and the ESP project source windows (for this, the Cluster Manager relies on another new ESP feature: the router)
  • The routing policy: multicast, round-robin, or hash (discussed later)
  • Optionally, the orchestration of the data sources, comparable to the ESP connector orchestration, if you need to run some “raw-sources” before others
  • The ESP cluster, which is a collection of running ESP engine instances

Here is an example of a Cluster Manager configuration:

[Image: NRoriginal1.png, example Cluster Manager configuration]

 

Routing policies

Three routing policies are available:

  • Multicast policy sends every event to all the engine instances
  • Round Robin policy sends events to engine instances in a round-robin fashion
  • Hash policy hashes the value of some pre-defined fields and uses that value to decide where to send the event
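The first two policies can be illustrated with a minimal Python sketch. It is a conceptual model only, not how the ESP router is implemented; the engine names and the helper functions `multicast` and `make_round_robin` are hypothetical.

```python
from itertools import cycle

# Hypothetical engine names, for illustration only.
ENGINES = ["esp1", "esp2", "esp3"]


def multicast(event, engines):
    """Multicast: every engine instance receives a copy of the event."""
    return [(engine, event) for engine in engines]


def make_round_robin(engines):
    """Round robin: each event goes to the next engine instance in turn."""
    rotation = cycle(engines)

    def route(event):
        return [(next(rotation), event)]

    return route


events = [{"id": i} for i in range(4)]
round_robin = make_round_robin(ENGINES)

for event in events:
    print("multicast  :", [engine for engine, _ in multicast(event, ENGINES)])
    print("round robin:", [engine for engine, _ in round_robin(event)])
```

With multicast, each engine sees the full stream; with round robin, the stream is split evenly but related events may land on different engines.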

The Hash policy requires you to define fields that are used in the hashing mechanism.

 

<hash-destination name='dest1' opcode='insert'>
   <publish-target>
      <project-func>PtradesAM</project-func>
      <contquery-func>trades</contquery-func>
      <window-func>TradesSource</window-func>
   </publish-target>
   <fields>
      <field name='venue'/>
   </fields>
</hash-destination>

 

Here, “venue” is the field to be hashed. The hash value is an integer between 0 and the number of engine instances minus one. The router uses the hash value to determine which engine instance the event is sent to. Value hashing is useful to ensure that related events are processed together. For example, in the broker surveillance model, all events relating to a specific broker must be sent to the same engine to accurately detect broker malpractice such as front running.
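The hash-to-engine mapping described above can be sketched in a few lines of Python. This is an illustration under assumptions: `zlib.crc32` stands in for whatever hash function the router actually uses (the article does not say), and the engine names, the sample trades, and the helper `engine_index` are hypothetical.

```python
import zlib

# Hypothetical engine names, for illustration only.
ENGINES = ["esp1", "esp2", "esp3"]


def engine_index(event, fields, engine_count):
    """Map the hashed field values to an index in 0 .. engine_count - 1."""
    key = "|".join(str(event[name]) for name in fields)
    # zlib.crc32 is a stand-in; the hash function the router really uses
    # is not described in this article.
    return zlib.crc32(key.encode("utf-8")) % engine_count


trades = [
    {"id": 1, "venue": "NYSE", "broker": "b1"},
    {"id": 2, "venue": "LSE", "broker": "b2"},
    {"id": 3, "venue": "NYSE", "broker": "b3"},
]

for trade in trades:
    index = engine_index(trade, ["venue"], len(ENGINES))
    print(trade["id"], trade["venue"], "->", ENGINES[index])
```

Events that share the same value for the hashed field always map to the same index, which is the property that keeps related events on one engine.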

 

 

How does it work behind the scenes?

Watch the following video to learn how the Cluster Manager works when it starts.

 

   

A few commands to know

The Cluster Manager has its own executable called dfesp_am_server. To start a Cluster Manager with its corresponding configuration file, run the following command:

 

$DFESP_HOME/bin/dfesp_am_server -pubsub 5565 -http-admin 5566 -cluster-manager file://ClusterManager.xml

 

No HTTP pubsub port is required. Note that the Cluster Manager does NOT automatically start the ESP XML Factory servers that are defined in its configuration file (esp-cluster): they must be provisioned and started before you run the Cluster Manager. To dynamically add a running ESP XML Factory server to the control of a Cluster Manager, run the following command:

 

curl -X PUT -d "<esp-engine name='srv02esp3' host='sasserver02' port='5595' ha_port='5596'/>" http://sasserver01:5566/SASESP/routerEngines/espmap1/srv02esp3

 

curl is used here (instead of dfesp_xml_client) because it lets us supply the request body inline, with the -d option. The request body contains the definition of the new ESP XML Factory server to add to the Cluster Manager’s control. This ESP instance must already be running. The REST API URL means that we want to add an engine called “srv02esp3” (the name must match the one in the request body) to the router called “espmap1” (the router takes the name of the esp-map defined in the Cluster Manager’s configuration file).

If we no longer need to publish events to a running ESP XML Factory server, we can remove it from the Cluster Manager’s control by running this command:

 

$DFESP_HOME/bin/dfesp_xml_client -url "http://sasserver01:5566/SASESP/routerEngines/espmap1/srv02esp3" -delete
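For scripted scaling, the same two REST calls could be built from Python's standard library. The sketch below reuses the URL, router name, and request body from the commands above; the helper names `add_engine_request` and `remove_engine_request` are hypothetical, and the requests are only constructed, not sent, since sending requires a live Cluster Manager.

```python
import urllib.request

# Values taken from the commands above.
BASE_URL = "http://sasserver01:5566/SASESP/routerEngines"
ROUTER = "espmap1"


def add_engine_request(name, host, port, ha_port):
    """Build the PUT request that adds a running engine to the router."""
    body = f"<esp-engine name='{name}' host='{host}' port='{port}' ha_port='{ha_port}'/>"
    return urllib.request.Request(
        f"{BASE_URL}/{ROUTER}/{name}",
        data=body.encode("utf-8"),
        method="PUT",
    )


def remove_engine_request(name):
    """Build the DELETE request that removes an engine from the router."""
    return urllib.request.Request(f"{BASE_URL}/{ROUTER}/{name}", method="DELETE")


add = add_engine_request("srv02esp3", "sasserver02", 5595, 5596)
remove = remove_engine_request("srv02esp3")
print(add.get_method(), add.full_url)
print(remove.get_method(), remove.full_url)

# Sending is left commented out because it needs a live Cluster Manager:
# with urllib.request.urlopen(add) as response:
#     print(response.status)
```

Deriving both the URL and the body from the same engine name keeps the two aligned, which the PUT call requires.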

 

Want to see it in action?

Watch this video to view a demo of the ESP Cluster Manager.

 

 

Thanks for reading.
