Deploy a custom web application in the cloud for Data-Driven Content object in SAS Viya 4

In my previous article, I explained how you can interact with an image within a Data-Driven Content object in a SAS Visual Analytics report. In this article, I will start from the same HTML page, adapt it to be rendered by a Node.js server, and ultimately deploy the application in the cloud. The steps outlined in this post use SAS Viya 4; however, similar steps can be applied to any web application deployed using Node.js in the cloud.

Before we dive into the different steps, I would like to thank my renowned colleague, Erwan Granger. Without his cloud deployment mentoring, this article would not have been possible.

Why use Node.js to deploy a web application?

While I'm not here to promote Node.js, I do want to outline some of the benefits of using it. Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. That means that when developing web applications, you only need to know one language for the back end and the front end: JavaScript.

When you develop using Node.js, you can rely on a large number of packages and the expertise of thousands of developers. The fact that many packages are available means that you will most probably find one that suits your needs. One of those packages is Express.js. I will use it to create a web server to render the HTML page developed in my previous article.

You might argue that there is no need for Node.js to deploy a web application. You are right. Nginx or Apache web servers are also valid options, as is Django (a web framework written in Python). The biggest difference with Nginx or Apache is the way you work with routing in your application. Where Nginx and Apache typically map requests to the folder structure, Node.js and Django rely on the logic the developers write. As a result, Node.js offers more possibilities when developing complex applications because you, the developer, can create your own routing logic that doesn't depend on the folder structure.

With that difference in mind, I also chose Node.js because of the way applications are deployed. With Node.js, you can install an application by running the "npm install" command. Why is that interesting? As you most probably know, CI/CD implies that developers update their code often and that the code needs to be deployed just as regularly and in an automated way. When using Node.js, you create a package.json file, which is used to install the application and the modules it depends on. Those dependencies can be different for Production and Development environments. This makes it easy to reduce the number of packages needed in Production, which in turn reduces the size of the deployed application. A side benefit is a smaller number of files stored in your Git repository: using a .gitignore file, you can easily exclude the node_modules folder (which contains the modules your application depends on).

By now, you have a better understanding of why I chose Node.js. The remainder of this post will not explain how to program using Node.js and the Express.js framework; there are plenty of resources on the web. The code for the application and the deployment is available on GitHub. I will nevertheless explain how to convert the HTML page from my previous article to the one deployed in Node.js.

Repository content

In this section, I will go through the content of the Git repository. You can clone the GitHub repository to your machine or you can navigate on GitHub directly.

The red boxes are related to the Node.js application while the blue boxes are related to the deployment.

The Node.js application contains:

libs folder contains the SAS provided JavaScript files to interact with Data-Driven Content objects. Those files are available under sassoftware repository on GitHub.
public folder contains the static files like stylesheets and images that are used by the application.
routes folder contains the functions to define the routing of the requests.
views folder contains the files rendered when accessing a specific route.
app.js file is the root file of the application. It is executed when starting the application.
config.js file contains the configuration information. In our case, it defines that the application listens for URLs like http://my.domain.com/ddc. If the application needs to be accessible from another URL, only config.js needs to be updated.
package.json file contains the information about the modules and their versions as well as the start command for the application. It is used by the "npm install" command.
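
To make the role of config.js more concrete, here is a minimal sketch of what such a file might contain. The property names (basePath, port) and the urlFor helper are assumptions made for illustration and are not taken from the repository; only the /ddc prefix and port 3000 come from this article.

```javascript
// Hypothetical config.js content: the property names are illustrative only.
const config = {
  basePath: '/ddc', // URL prefix under which the application is served
  port: 3000        // port the Node.js server listens on
};

// Build an application URL path from the configured prefix.
function urlFor(route) {
  const prefix = config.basePath.replace(/\/+$/, ''); // drop trailing slashes
  const suffix = String(route).replace(/^\/+/, '');   // drop leading slashes
  return suffix ? `${prefix}/${suffix}` : prefix || '/';
}

console.log(urlFor('/toto')); // prints "/ddc/toto"
```

In a real config.js, the object would typically be exported with module.exports so that app.js can require it. Changing basePath in one place is what allows the application to move to another URL without touching the routes.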

For the deployment, we have:

manifest folder which stores the script and the yaml file to deploy the application on Kubernetes.
Dockerfile which contains the parameters needed to build a Docker image of the application.

If you want to run the application to test it on your local machine, you should:

install Node.js
install git
start a command prompt
navigate to the folder where you want to download the application

clone the repository using the following commands:

git config --global http.sslVerify "false"
git clone https://github.com/xavierBizoux/ddc-container.git

navigate inside the repository
execute the following commands:
```
npm install
npm start
```

If everything works fine, open your browser and navigate to this URL:

http://localhost:3000/ddc

You should see the following:

Deploying an HTML page to Node.js

In order to deploy the original HTML page into Node.js, a few modifications are required. Basically, when publishing an existing HTML page to Node.js using Express.js, you need to register your HTML page as a view. As you have seen in the previous section, the views are stored in the views folder. In our application, we have two views: error.ejs and index.ejs. The HTML code from the previous article has been copied to index.ejs. The .ejs extension indicates that this is a view which uses Embedded JavaScript templating (EJS). This is one of the templating engines that Express.js supports to render dynamic HTML content.
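
As a sketch of what registering views looks like in Express.js, the function below configures EJS as the view engine and renders index.ejs for the root route. The function name registerViews and the title value are assumptions; app.set, app.get and res.render are standard Express.js calls. The app object is passed in as a parameter, so the snippet does not depend on how the application itself is created, and the repository may organize this differently (for example with a router in the routes folder).

```javascript
// Sketch: register the views folder and the EJS engine on an Express app.
// `app` is an Express application; `viewsDir` is the absolute path to views/.
function registerViews(app, viewsDir) {
  app.set('views', viewsDir);    // where Express looks for .ejs files
  app.set('view engine', 'ejs'); // lets res.render('index') resolve index.ejs

  // Render views/index.ejs when the root route is requested.
  app.get('/', function (req, res) {
    res.render('index', { title: 'SAS DDC' }); // title is a hypothetical variable
  });
  return app;
}
```

With Express.js and EJS installed, this could be called as registerViews(express(), path.join(__dirname, 'views')).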

In our case, we should set all the paths defined in the HTML code as relative to the public folder. This folder contains the static content: images, stylesheets, JavaScript libraries and files.

To be clear, here are the lines that were updated:

Let's now check the content of public folder:

If you carefully looked at the content of the public folder, you probably noticed there is no javascripts folder. You might wonder how the web server will find the JavaScript files if there is no javascripts folder. This is where Node.js will do some "magic" for the application. In Node.js, most of the JavaScript libraries/frameworks like jQuery and Bootstrap are installed as modules. As a result they are stored in the node_modules folder after executing the "npm install" command. For security reasons, it is recommended to reduce exposure of the Node.js modules and therefore some specific routes are defined within the app.js file.

As you can see, routes are set for the JavaScript and CSS files that are stored in the node_modules folder as well as in the libs folder.
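
The following is only a sketch of how such routes are typically declared with express.static; it is not a copy of the app.js file in the repository. The URL prefixes and the module folders (jquery/dist, bootstrap/dist/js, bootstrap/dist/css) are assumptions and may differ from the actual application.

```javascript
// Sketch: expose only selected folders from node_modules and libs.
// Each entry maps a public URL prefix to a folder relative to the app root.
// The prefixes and folders below are illustrative assumptions.
const staticMounts = [
  { url: '/javascripts', dir: 'node_modules/jquery/dist' },
  { url: '/javascripts', dir: 'node_modules/bootstrap/dist/js' },
  { url: '/javascripts', dir: 'libs' },
  { url: '/stylesheets', dir: 'node_modules/bootstrap/dist/css' }
];

// `express` is the module returned by require('express'); `app` is an Express app.
function mountStatic(app, express, rootDir) {
  for (const { url, dir } of staticMounts) {
    // express.static serves files from a single folder, so the rest of
    // node_modules stays unreachable from the browser.
    app.use(url, express.static(`${rootDir}/${dir}`));
  }
  return app;
}
```

Several folders can share the same URL prefix: when a file is not found in one folder, express.static passes the request on to the next middleware. This is why the HTML page can reference a javascripts path even though no such folder exists under public.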

If you want more information about the code, you can refer to Express.js documentation and the Express.js generator.

If you want to add another HTML page to the application, you should:

create an .ejs file in the views folder
add a new route to the index.js file located in routes folder
(re)start your application

To define a new route, you add code similar to this in the index.js file:

/* GET toto page. */
router.get('/toto', function(req, res, next) {
  res.render('toto', { title: 'Toto SAS DDC' });
});

After the application is (re)started using the "npm start" command, the newly created view will be accessible at: http://localhost:3000/ddc/toto

Application deployment

Our objective is to use this application in combination with SAS Visual Analytics deployed on SAS Viya 4, so the application should be deployed in the cloud using Kubernetes. Even though it is not mandatory to deploy the application in the cloud, I recommend doing so for two reasons: it reduces the CORS and CSRF configuration needed, because the application will be served from the same domain as the Viya server, and it makes it easier for the Kubernetes administrator to manage the environment.

When the time comes to deploy the application, you have two options:

create a Docker image and deploy it
create a manifest to deploy your application

In this example, we will use the second option. If you discuss this with a Kubernetes administrator, they might argue that the first option is better (or not), but here are the reasons why we will not create a Docker image:

When you create a Docker image, you need to push the image to a Docker registry in order to deploy it using Kubernetes. By default, a SAS Viya deployment doesn't require one. This means that you need to create a registry or upload your image to a shared registry, which might not be ideal.
If your code is often updated, the Docker image needs to be recreated and downloaded from the repository to the Kubernetes cluster. Note that in our case, this would not be an issue because the image is small.

The benefits of using a manifest file are:

You reuse the same base image. As a result, when you update your code, only the code is downloaded and copied to the container. The base image will be available to the Kubernetes cluster from the first creation.
There is no need to create a Docker registry, nor to build the image and update it in the registry.

In our case, we are not building a Docker image, but if you need to do so, for example because the Kubernetes administrator requires it, you can use the Dockerfile in the repository to build the image.

The easiest way to build the image is to use Visual Studio Code and Docker Desktop. You install both on your machine and add the Docker extension to Visual Studio Code as described in the Working with containers documentation. When the software is installed, you should:

open the git repository using Visual Studio Code
right click on the Dockerfile and select Build Image ...
using the Docker extension, you can then run the image

Working with a manifest file

Let's focus on the ddc_manifest.yaml file located in the manifest folder of the repository. The file contains the information needed by the kubectl command to create a pod in a specific namespace in Kubernetes. In our case, the manifest will generate two resources: a Deployment and a Service. If you need more information about the different resource types, please refer to the Kubernetes documentation:

The Deployment resource type will be used to create the pod.
The Service resource type will define how the pod can be accessed.

Deployment

I will not explain the Deployment in detail, but I want to draw your attention to specific points. Writing a yaml file for a Deployment resource can be complex and might require some expertise that is beyond the scope of this post.

In order to deploy our application, we need to define a container. In this case, the container will be based on the node:10-alpine image. This is an official Node.js image that is available on Docker Hub.

I think the names of the different properties are self-explanatory.

The command will be executed inside the container at startup.
The args are passed to the command.
The volumeMounts property mounts a volume into the container.

The volume is defined as:

This means that we have defined a container with instructions on how to start it, and a volume that will store some data but is initially created as an empty directory. As a result, when the pod is created, no data is persisted from the previous pod execution.

If we use only the container and volumeMounts information, we have a container that will fail to start because our application has not been copied from the Git repository. This is where the initContainers come in. An initContainer does the preparation work: it clones the Git repository and makes it available to the containers.

The initContainer uses another Alpine-based image, which has git pre-installed. It executes the git commands to clone the repository to the mounted volume.

A simplistic view of what happens when a pod is created:

Pod is created using the specifications provided.
In the pod, the initContainer starts and mounts the shared volume.
The initContainer clones the repository to the shared volume.
The initContainer stops as soon as the clone operation is completed.
The container starts and mounts the shared volume which now contains the git repository.
The container executes the commands to install and start the application.
The container waits for calls on port 3000.
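
The sequence above can be sketched as a Deployment manifest. It is written here as a JavaScript object and printed as JSON, which kubectl accepts in the same way as YAML. This is not the content of ddc_manifest.yaml: the resource names, labels, mount path, the alpine/git image, and the exact command and args are assumptions. Only the Node.js Alpine base image, the repository URL, the empty directory volume and port 3000 come from this article.

```javascript
// Sketch of a Deployment with an initContainer that clones the repository
// into a shared emptyDir volume. Names, labels and paths are hypothetical.
const repoUrl = 'https://github.com/xavierBizoux/ddc-container.git';
const sharedMount = { name: 'app-volume', mountPath: '/app' };

const deployment = {
  apiVersion: 'apps/v1',
  kind: 'Deployment',
  metadata: { name: 'ddc-deployment' },
  spec: {
    replicas: 1,
    selector: { matchLabels: { app: 'ddc' } },
    template: {
      metadata: { labels: { app: 'ddc' } },
      spec: {
        // Starts empty each time a pod is created; nothing is persisted.
        volumes: [{ name: sharedMount.name, emptyDir: {} }],
        // Runs to completion before the main container starts.
        initContainers: [{
          name: 'clone-repo',
          image: 'alpine/git', // assumed image; its entrypoint is the git command
          args: ['clone', repoUrl, sharedMount.mountPath],
          volumeMounts: [sharedMount]
        }],
        containers: [{
          name: 'ddc',
          image: 'node:10-alpine',
          workingDir: sharedMount.mountPath,
          command: ['sh', '-c'],
          args: ['npm install && npm start'],
          ports: [{ containerPort: 3000 }],
          volumeMounts: [sharedMount]
        }]
      }
    }
  }
};

console.log(JSON.stringify(deployment, null, 2));
```

Both containers mount the same volume, which is what allows the code cloned by the initContainer to be found by the main container. Saving the output to a file (for example a hypothetical deployment.json) would allow it to be applied with kubectl apply -f.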

Service

Thanks to this Deployment resource, we have created a pod. But at this stage, there is no stable address for communicating with the pod within the Kubernetes cluster. We therefore need to define a Service.

In this example, the service will listen on port 3000 and will forward the requests to port 3000 in the pod.
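
As a sketch, a Service of this kind could look as follows, again expressed as a JavaScript object printed as JSON. The ddc-service name and the ports come from this article; the selector label (app: 'ddc') is an assumption and would have to match the labels set on the pod by the Deployment.

```javascript
// Sketch of a Service forwarding port 3000 to port 3000 in the pod.
// The selector label is hypothetical and must match the pod labels.
const service = {
  apiVersion: 'v1',
  kind: 'Service',
  metadata: { name: 'ddc-service' },
  spec: {
    selector: { app: 'ddc' },
    ports: [{ protocol: 'TCP', port: 3000, targetPort: 3000 }]
  }
};

console.log(JSON.stringify(service, null, 2));
```

Here, port is the port exposed by the Service inside the cluster and targetPort is the port the container listens on.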

Ingress

Executing the following command will create and start the pod using the information we have provided. In this example, the namespace is named big. It matches the namespace where SAS Viya is deployed.

kubectl -n big apply -f https://raw.githubusercontent.com/xavierBizoux/ddc-container/master/manifest/ddc_manifest.yaml

At this stage, we have a running pod that listens on a specific IP address on port 3000. But what happens when we recreate the pod or spin up another pod because our application needs to scale up? As you can imagine, accessing a specific IP address is not a good practice, especially as it might change over time. This is where Ingress plays a role. Ingress routes the requests from a specific URL to the available resources. Usually, the Ingress definition is similar to this:

As you can see, it is routing the requests from /ddc on big.myDomain.com to ddc-service on port 3000. The ddc-service is the Service we have defined earlier.
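
The routing described above can be sketched with the function below, which returns an Ingress definition for a given host. It assumes the networking.k8s.io/v1 schema, which may differ from the API version used in the original manifest, and the ddc-ingress name is hypothetical. The /ddc path, the ddc-service name and port 3000 come from this article.

```javascript
// Sketch of an Ingress routing /ddc on a given host to ddc-service:3000.
// Uses the networking.k8s.io/v1 schema; the resource name is hypothetical.
function ingressFor(host) {
  return {
    apiVersion: 'networking.k8s.io/v1',
    kind: 'Ingress',
    metadata: { name: 'ddc-ingress' },
    spec: {
      rules: [{
        host: host,
        http: {
          paths: [{
            path: '/ddc',
            pathType: 'Prefix',
            backend: { service: { name: 'ddc-service', port: { number: 3000 } } }
          }]
        }
      }]
    }
  };
}

console.log(JSON.stringify(ingressFor('big.myDomain.com'), null, 2));
```

Taking the host as a parameter makes it easy to generate the resource for a different hostname. Depending on the ingress controller used in the cluster, an ingressClassName or controller-specific annotations may also be needed.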

If you know the hostname of the SAS Viya environment up front, you can add the Ingress definition at the bottom of ddc_manifest.yaml and execute the kubectl command mentioned above.

If you are running on a test environment where the URL might change, you can use the quick_deploy.sh file located in the manifest folder of the repository. That file is designed to create the Ingress resource on the fly using the hostname of the machine running the kubectl commands. It will execute the commands within the big namespace. If you use another namespace, please update the file to reflect the actual namespace.

Usage

So far, we have created a Node.js application and deployed it within a container running on a Kubernetes cluster. It is now time to use that application within SAS Visual Analytics. Our SAS Viya 4 environment runs on:

http://big.rext03-0076.race.sas.com/SASVisualAnalytics

and our web application can be accessed on:

http://big.rext03-0076.race.sas.com/ddc

If you follow the instructions to create the report, you should end up with a report similar to this:

Conclusion

As soon as you have a web application that can be used within SAS Visual Analytics, you can deploy it in the cloud using a manifest file that meets your needs. You can deploy the application within a Node.js server or you can use the web server of your choice.

Integrating your application in a CI/CD process can also be done with ease as soon as you have the manifest file or a script that chains the different actions. If you want to integrate with Jenkins, that is also possible, as Jenkins can detect changes in a Git repository for you. It's up to you to decide how far you want to go with automation.

In terms of development, you can also build much more complex applications that will, for example, retrieve data from another source through APIs and merge it with CAS data from the Visual Analytics report, or call a model published to MAS after making selections within your VA report. If your application becomes more complex, you can separate it from the SAS Viya namespace but still make it accessible through the same URL using Ingress.

As you can see, you have plenty of possibilities from an administration as well as from an application development point of view. SAS Viya is what you make of it!

SAS Communities Library