SAS Event Stream Processing 2020 and later is now cloud-native and works in a Kubernetes environment like the rest of SAS Viya. While this type of major deployment change should be transparent to the end users, there are a few things that an ESP project developer needs to know to fully leverage this new environment. Let’s review some of them in this post.
Previously, in SAS ESP 6.2 and earlier, at least one ESP server had to be started before developers could comfortably design an ESP project: discover connector parameters dynamically, query ASTORE file inputs and outputs, and so on. The same server was then used to test the project.
Letting users start their own ESP server on a Linux machine was probably a hurdle at some customer sites. Sites that didn't want users to connect directly to the Linux machines may have implemented ESP servers as daemons, provisioning multiple ESP servers shared between users. While that does the job, it offers less flexibility in terms of configuration options for those ESP servers (logging, Python context, etc.).
With ESP 2020 on Kubernetes, an ESP developer doesn't need to think about starting an ESP server to design a project. They just connect to the ESP cluster provided by default (named after the Kubernetes namespace).
Behind the scenes, a continuously running ESP server in a dedicated pod (named sas-event-stream-processing-client-config-server, if you are curious) handles all the tasks the “factory” server used to perform: discovering connector properties, the parameters of online analytical algorithms, and ASTORE input and output variables.
When it comes to testing a project, the ESP developer will observe a new behavior. When the user triggers a test, a brand-new ESP server is started on demand in a new Kubernetes pod. Thus, the ESP server used to design the project and the ESP server used to test it are two separate servers. If a user tests five ESP projects at the same time, ESP launches five additional Kubernetes pods, each running a single ESP server.
Since developers might have less control over the way ESP servers are started in Kubernetes, new customization capabilities have been added to the user interface of both ESP Studio and Event Stream Manager, so that users can specify ESP options or environment variables in their settings before testing or running a project. Here is how it looks:
With ESP 2020 on Kubernetes comes a new execution model: an ESP server runs one and only one ESP project. Whether you are testing an ESP project or running one from ESP Studio or Event Stream Manager (ESM), a new Kubernetes pod is instantiated with a single ESP server that executes this ESP project. ESP engines are no longer used.
That’s one of the big impacts of Kubernetes. Accessing files in a physical directory on the host is no longer “easy”. All the paths you see inside the ESP environment are virtual paths: they don’t exist outside the cluster. So, you’ll need to work with your Kubernetes administrator to set up Kubernetes mechanisms like Persistent Volumes (PV) and Persistent Volume Claims (PVC) to make files visible to the ESP pods and containers.
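As an illustration, here is a minimal PersistentVolumeClaim sketch that a Kubernetes administrator might create so that ESP project pods can mount shared files. The name, namespace, size, and storage class below are placeholders to adapt to your cluster:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: esp-input-data         # placeholder name
  namespace: sasviya           # your Viya/ESP namespace
spec:
  accessModes:
    - ReadWriteMany            # several ESP pods may need the same files
  resources:
    requests:
      storage: 10Gi
  storageClassName: nfs-client # placeholder storage class
```

Once bound, such a claim can be mounted into the ESP server pods so that project paths resolve to real files.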
Why would you want to access files outside the Kubernetes cluster?
ESP adapters are no longer available outside of Kubernetes. Most of them have a corresponding connector; it is thus recommended to use the connector instead.
For adapters that don’t have a corresponding connector, you will have to use the Adapter Connector which allows you to call an adapter from a connector configuration.
It is still possible to interact with an ESP server using the REST API. However, finding the right endpoint is trickier than before.
Before, you had an ESP server listening on a machine on an HTTP port (5702, for example). The endpoint was independent of the projects running in that ESP server and looked like:
Now, with Kubernetes, you need the project name to build the endpoint. Indeed, the project name is used as the Kubernetes service name for that particular ESP server:
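As a sketch, the difference between the two endpoint styles could look like this. The host names, the namespace, and the in-cluster service DNS form are assumptions to adapt to your deployment (/SASESP is the ESP REST API base path):

```python
# Sketch: how the REST endpoint changes on Kubernetes.
project = "myproject"    # the project name is also the Kubernetes service name
namespace = "sasviya"    # your ESP/Viya namespace (assumption)

# Before: one fixed server, endpoint independent of the running projects
old_endpoint = "http://espserver.example.com:5702/SASESP/projects"

# Now: one service per project, so the project name is part of the host
new_endpoint = f"http://{project}.{namespace}.svc.cluster.local/SASESP/projects"
```

The `<service>.<namespace>.svc.cluster.local` form is the standard Kubernetes service DNS name; from outside the cluster you would instead go through the ingress exposed by your deployment.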
Nothing insurmountable, except when you start to use ESP project names that are not compliant with DNS subdomain naming: names including capital letters, underscores, etc. In this case, the project name is transformed to build a compliant service name. You can find this service name in the ESP Studio user interface by looking at that “host” name:
Project Name: my_IoT_Project
Service Name: my-5f-49o-54-5f-50roject
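Out of curiosity, the transformation in the example above looks like a simple hex encoding of the characters that are not valid in a lowercase DNS label. Here is a small Python sketch that reproduces the example; it is an educated guess based on that single example, not the documented algorithm:

```python
def guess_service_name(project_name):
    """Educated guess at the ESP project-name-to-service-name mangling:
    lowercase letters and digits are kept, every other character is
    replaced by '-' followed by its ASCII code in hex."""
    parts = []
    for ch in project_name:
        if ch.islower() or ch.isdigit():
            parts.append(ch)                # valid DNS label character, kept
        else:
            parts.append("-%x" % ord(ch))   # e.g. '_' (0x5f) becomes '-5f'
    return "".join(parts)

print(guess_service_name("my_IoT_Project"))  # -> my-5f-49o-54-5f-50roject
```

In practice, sticking to lowercase letters, digits, and hyphens in project names avoids the issue entirely.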
As with the REST API, using Python and ESPPy to communicate with ESP on Kubernetes is slightly different. To connect to an existing ESP server or to dynamically start a new ESP server in Kubernetes, you need an additional component: a kubectl proxy that acts as an interface between Python and ESP. Once you have it, the connection to ESP looks like this (assuming the kubectl proxy is listening on a given machine on port 8001):
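A minimal sketch of such a connection, assuming the kubectl proxy runs on proxy-host:8001; the API group and version in the URL (iot.sas.com/v1alpha1) and the namespace are assumptions to verify against your deployment:

```python
def esp_k8s_url(proxy_host, proxy_port, namespace, project):
    # Assumed URL pattern for reaching a project-level ESP server
    # through kubectl proxy; adapt the API group/version to your release.
    return (f"http://{proxy_host}:{proxy_port}/apis/iot.sas.com/v1alpha1"
            f"/namespaces/{namespace}/projects/{project}")

url = esp_k8s_url("proxy-host", 8001, "sasviya", "myproject")

# With ESPPy installed, the connection itself would then be:
# import esppy
# esp = esppy.ESP(url)
```

The kubectl proxy handles authentication against the Kubernetes API, which is why plain HTTP to localhost is enough on the Python side.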
This Python statement connects to an existing “myproject” server or creates it on the fly.
ESP can be deployed in two ways:
When deployed without Viya, two main capabilities are not available:
While this was probably already an issue when ESP developers didn’t have access to the Linux boxes to read the full ESP log, it is even harder now that the ESP logs reside in containers, inside pods, in the Kubernetes cluster.
Good news: with recent versions of ESP Studio, the ESP developer can now access the ESP server log directly in ESP Studio.
Like in ESP Studio, a user doesn’t need to add ESP servers explicitly when they are instantiated in the Kubernetes cluster. ESP Streamviewer auto-discovers ESP servers.
In previous versions of ESP, developers might have used some of the ESP utilities provided out-of-the-box like the following ones:
While dfesp_xml_client can easily be replaced by the curl utility, and dfesp_analytics’ role is handled dynamically by the ESP Studio UI, the other two have no equivalent in the new SAS Viya.
One cool thing about Kubernetes is that it provides built-in mechanisms for autoscaling, and ESP leverages them when you run ESP projects with Event Stream Manager. You can specify how many replicas are started initially, and the maximum number of replicas the project can scale to when it reaches certain thresholds in terms of CPU usage or memory consumption. This works well for ESP projects whose replicas have no inter-dependencies.
Here is how you set this up in Event Stream Manager:
With ESP on Kubernetes, there are a few adjustments a user should be aware of to take full advantage of this new architecture. This article briefly highlighted some of them. If you think of other challenges, feel free to comment.
Thanks for reading.