You can have your container, and eat it too!

In your day job, how often do you have to type the following words:

docker run <something>

I’m a relative newbie when it comes to Docker but I think that even as a newbie, there are some lessons to be shared. There are aspects of containers that are so important, and yet not so difficult to grasp, that I hope this post about SAS Viya and Containers will be useful to you, and allow me to share some of my early thoughts and insights.

Before posting this here, I did share this content with some of my colleagues who are much more knowledgeable about containers than I am. And aside from rolling their eyes at my analogies and over-simplifications, their feedback was positive.

So, here are 5 simple things you should understand about containers.

Containers are not magic
Containers are not just a delivery model
Containers, like cakes, have layers
Containers, like cakes, can be homemade or store-bought
Containers, unlike cakes, can be cloned quickly and infinitely

I will develop these points and the ways they relate to Viya.

Containers are not magic

I know, everyone likes magic. But behind a well-executed trick, there is often a lot of work, preparation, and a little sleight of hand.

In fact, while attending a beginner class on Docker and Kubernetes, I was surprised to learn that containers are not new, and that Docker essentially found a way to pull multiple technologies together into an elegant solution that has now become the standard way of doing containers.
But, if your application is bad, putting it inside a container will not make it better.
If your app does not scale, or if your app is stateful, putting it inside a container, although possible, might just make it behave even worse.
Further to this, you will still be bound by the limits of physics: If the Server your container runs on has terrible IO performance, the containers that are running on it will not magically have amazing IO performance.
Finally, if you are going to be using Containers in exactly the same way that you started using virtual machines fifteen years ago, there will be very little advantage, and possibly a lot of downsides.
So, before you over-promise on what a container can do, check in with the experts. (Not me. A real container expert.)

Containers are not just a Delivery Model

We, at SAS, often use the words “delivery model” or “deployment model”. At Viya 3.3 we had one major delivery/deployment model, which we affectionately refer to as "BareOS". That is to say that if you had the right operating system (Linux, at that time), Viya 3.3 could be installed. The implication was that you did not require any specific infrastructure, just a Linux server, physical or virtual, on-prem or in the cloud.

In September 2018, Viya 3.4 started supporting Windows and Containers as “deployment models”.
This sounds great at first glance: who wouldn’t want more choices in the way the software gets delivered? But what those simplistic statements fail to convey is that not all “delivery models” are created equal, and that they don’t necessarily leave you with the same software behavior or capabilities.

For example, although SAS supports running Viya on Windows as of September 2018, that is not true across the board for all SAS Viya offerings (e.g. SAS Visual Investigator is not yet supported on Windows). Furthermore, on Windows, only single-machine deployments are supported. (Not just CAS SMP: the entire Viya platform has to fit on a single server.)

So, although the answer to “Can Viya run on containers?” has changed from a no to a yes, this is a qualified yes. It’s a “Yes, but, it’s a single container image, and it’s a Programming-Only container, meaning that it’s only able to run SAS Studio, MVA SAS, and CAS, with possibly some Access Engines”.
If someone simply says "Viya runs on Containers", they are not giving you the full picture and you should check that your specific requirements will be met by the container.

The reason I stress this point is because many might hear “delivery model” and think about something like Uber Eats. If you order a cake, you don’t care how it’s delivered, whether by car, foot, or bike. You only care about the time it takes for it to arrive, and that the cake gets there intact. But you must realize that if you order a ten-foot-tall wedding cake, you might want to use the "truck" delivery model, rather than the "skateboard" delivery model. If the delivery model influences what kind of cake you can get, its flavor and how big it is, you will be a lot more careful about how it gets delivered.

So, let me repeat it here again: The first Viya Container is a single-image, programming-only container. It will not be a full-fledged Viya deployment. And although you will be able to scale out by running multiple instances of this Viya container, each one will work by default as an independent unit, which has both pros and cons. With that said, you can also bet on the fact that the offerings on containers will keep evolving, improving and expanding with time.

Containers, like cakes, have layers

Docker container images are built by layering changes on top of one another. In a way, like some cakes.
You’d start with a cake pan, obviously to contain the cake, and then add some layers. Batter, sugar, frosting, etc. (Can you tell I don’t bake often?)

As of the 18th of September 2018, any SAS customer who licenses Viya 3.4, with their SAS Software Order, can download (aka “docker image pull”) and run (“docker container run”) the Viya Programming-Only ~~cake~~ container image. (Beware: if you got your Viya 3.4 Order before that date, this won't work for you!)

To the average user, well, it looks simply like a cake. You don’t see the layers of a finished cake unless you cut it open.
But, and this is where it gets interesting, if you need the cake to do a bit more, you can add more layers, on the top.
You could, for example, add your own layer to the Viya Container-Cake, to overlay a Jupyter notebook on top. Now, although we can test and control the taste and look of the cake when it leaves SAS’s kitchen, there is no telling what adding more layers on top of that might do. There is also no way to test in advance all the possible combinations of layers that could be added to a cake.
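
For readers who think better in code than in cake, here is a toy Python sketch of the layering idea. It is only a mental model under my own assumptions, not how Docker is actually implemented, and the layer names are made up for illustration. The point it tries to show is that adding a layer on top produces a new image and leaves the original one untouched.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Image:
    """A toy image: an ordered, immutable stack of layer names."""
    layers: Tuple[str, ...]

    def add_layer(self, layer: str) -> "Image":
        # Adding a layer never modifies the base image. It returns a new
        # image whose stack is the base stack plus one layer on top.
        return Image(self.layers + (layer,))


# Hypothetical layer names, for illustration only.
base = Image(("linux-base", "viya-programming-only"))
custom = base.add_layer("jupyter-notebook")

print("base image layers:  ", base.layers)
print("custom image layers:", custom.layers)
```

In this model, `base` is the cake as it left the kitchen, and `custom` is the same cake with one extra layer added by someone else. The bottom layers are identical, which is why only the top one is outside what the original baker could test.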

Containers, like cakes, can be homemade or store-bought

There is another aspect of containers that probably does not get enough publicity. If you come up with a new cake recipe, and this cake contains proprietary ingredients that only you sell, you have a couple of options:
You can simply manufacture the cake, and sell it in stores.
Or, you could make the recipe available to the public. Then, if they get the ingredients, and follow the recipe, they should end up with the right-looking cake.
This, however, is a potentially less reliable way of making that very same cake. For example, if you try to use a type of cake pan that is incompatible with the recipe, you might never get to the right cake.
If you keep reading my posts and my container-related content, you’ll probably hear me differentiate between the “container recipe” and the “pre-baked container”.
Keep in mind that they are slightly different beasts, and that their level of supportability will likely vary. For example, if you change the color of the tray in which you bake the cake, that's probably fine. But if you want to use a plastic container instead of a metal one, that probably will never work.

Containers, unlike cakes, can be cloned

Every analogy must stop somewhere, and I’ve stretched mine to that limit. (In fairness, this is just a shortcoming on the part of the cake industry, in my opinion). Once you have obtained (or built) your cake (a Docker image), you can essentially clone it as many times as you want.

When you “run an instance of a container”, you essentially start a clone of your container image. That running container instance will consume computing resources: CPU, Memory, IO, Network, etc.
A container image, by contrast, is really only using some space on a storage device.

I found the following on this website. I am paraphrasing:

A container image is an inert, immutable file that is essentially a snapshot of a container
A running container instance is the result of running a specific container image
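
The distinction between the two can be sketched with another toy Python model. Again, this is a simplification under my own assumptions and not Docker's real API: the class names, the `run` helper, the image name and the size are all hypothetical. The image is a frozen object that just sits there, while each call to `run` produces an independent running instance with its own identity and its own state.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ContainerImage:
    """Inert and immutable: at rest, it only takes up storage space."""
    name: str
    size_mb: int


@dataclass
class RunningContainer:
    """A running instance started from an image, with its own state."""
    container_id: int
    image: ContainerImage
    files: Dict[str, str] = field(default_factory=dict)


_next_id = itertools.count(1)


def run(image: ContainerImage) -> RunningContainer:
    # Every call starts a fresh, independent clone of the same image.
    return RunningContainer(container_id=next(_next_id), image=image)


# Hypothetical image name and made-up size, for illustration only.
image = ContainerImage(name="viya-programming-only", size_mb=4096)

first = run(image)
second = run(image)

# A change made inside one running container does not affect the other
# clone, and it does not change the image either.
first.files["/tmp/results.csv"] = "written by the first clone"

print(first.container_id, second.container_id)
print(second.files)
```

Both instances point at the same image, but whatever happens inside `first` stays inside `first`. That independence is what makes the "one container per user" versus "one shared container" question worth thinking about.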

This is an important aspect of containers: they are meant to be replicated.
This will in turn influence how you choose to use them.
Assuming that you have a Viya Programming-Only Container image, do you want to run one container for each user, or do you want to have a single running container, shared by all your users?
There are pros and cons to each method, that are beyond the scope of this post.

Asking the right questions

If you want to have something called "Viya on Docker", you need to ask yourself what the driving force behind this is.

Is it because you are “moving to the cloud”?
- You can run Viya on IaaS very easily and successfully without using containers.
Is it because you want Viya to scale infinitely and automatically?
- If so, be aware that it is not the current state of Viya, whether you are using containers or BareOS for that matter.
Is it because you have a standard that everything must fit inside a container?
- There is a difference between do-able and advisable.
- Viya has many Stateful components, and fitting those into containers requires a lot of care.

And I’m sure there are plenty of other reasons out there.

It is up to you to learn more about the Viya container, and make sure that it is a good fit for your purposes.

If you want to start learning more about SAS and Containers, this page should help get you started.

JuanS_OCS · ‎11-02-2018

Hello @ErwanGranger,

thank you very much. I do like, not only your content, but also the way you present the content.

Let me join you with a few thoughts and some questions. Please bear with me.

SAS Viya, per-se, does require many resources, and it can provide multi-tenancy and MPP with a single environment.

Hence, this raises my first questions:

What is the advantage of containers compared with CAS MPP and multi-tenancy?
Would containers not generate too much overhead on the hardware, just from the Viya services (not the actual usage), especially compared with multi-tenant environments?
Is there any limitation, due to the capacity of the microservices, on the number of tenants that SAS Viya can handle at this moment?

In addition to this, at this moment I can only think that this option could be interesting mostly for quick deployments in the cloud (or private clouds), and in any case only for environments that are quickly (and largely) scalable. I am hoping you can comment on this initial guess. I am wondering how (SAS Viya) containers could help small and mid-size businesses as well (or large businesses where scalability is not yet one of their strengths).

Thank you in advance!

Kind regards,

Juan

ErwanGranger · ‎11-02-2018

Hello @JuanS_OCS, and thanks for the comment. The ink on that post barely had time to dry. 🙂

First and foremost: as I mentioned in the post, the current container offering for Viya is Programming-Only, meaning that there are no microservices involved. So Multi-Tenancy, in the sense that you are thinking of, is not an option at this time.

Meaning also that it is very lightweight in its baseline memory usage. Don't quote me on that, but in my default container setup, starting an instance of the container only requires around 500 MB of memory. (Keep in mind that this memory usage will grow as soon as data gets added to CAS).

When a full Viya deployment option becomes available on containers, many of your questions will be easier to answer.

For example, it's likely that at that point I would be able to answer that "the baseline memory footprint for a full Viya deployment is roughly the same, whether you use containers or not". Or maybe I'll be able to say that "Multi-tenancy is an option, regardless of whether you are using containers or not". But for now, because the Container offering is so different from the full Viya offering, it is too early to have those discussions.

As it stands, the non-container offering (called BareOS) is currently the most feature-rich: You can have SMP CAS or MPP CAS, you can have Single or Multi-Tenancy, you have access to all the Visual Web Applications, you can integrate with various Authentication providers, etc. For large businesses, this is likely to be still the most appealing option.

On the flip side, with the current Viya Programming-Only container, the deployment is much simpler (docker pull, docker run) if you are already familiar with docker. If your data is not too big, and if you know how to code in SAS or in CAS, the container option can be appealing. You can also run the container in batch mode, which opens up interesting scalability possibilities.

I hope this helps a little. And I have to warn you that many of my answers will likely need to be tweaked in the coming months. So stay tuned!

JuanS_OCS · ‎11-04-2018

Thank you very much! I will definitely stay tuned, very interested in this topic!