
SAS Viya 4: Loading External Path-Based Data

Started 02-24-2021
Modified 02-25-2021

You might have heard that the new SAS Viya (2020.1 and later) will deploy solely on containers orchestrated by Kubernetes. While Kubernetes introduces many benefits, it also introduces many changes, especially regarding storage. In particular, the containers that Kubernetes orchestrates are virtual with paths that map to virtual locations. So what happens when you have a source data admin wanting to pass data into containerized Viya? How do you answer the question, "Where should I drop the data?" All the paths you see inside the Viya env are virtual paths. They don't exist outside the cluster. How do you find out where they map on the physical hardware?


Kubernetes Pod to Physical Path Mapping Mechanisms

While all container technologies offer storage mapping mechanisms (e.g. Docker bind mounts and Docker volumes), these should be avoided here because they are configured per container on a single host, outside of Kubernetes' control. Kubernetes offers two main mechanisms (there is always another way to do something, right?) for mapping internal paths to external storage resources: using Persistent Volumes with Persistent Volume Claims, and defining volumes directly in a pod. We'll cover both mechanisms here.


Using Persistent Volumes with Persistent Volume Claims

Kubernetes Persistent Volumes map the storage paths inside the containers to the storage resources available to the physical hosts. As the image below shows, this mapping is done via three levels of abstraction:

 

  1. A persistent volume (PV) makes a physical storage location (NFS, AWS EBS, Azure Files, etc.) available to the cluster.
  2. A persistent volume claim (PVC) requests access to the PV inside a cluster namespace.
  3. A volumeMount maps the PVC to a location within the Kubernetes pod.

PV -> PVC -> volumeMount

 


 

 

Defining Persistent Volumes

Persistent volumes are defined in Kubernetes YAML files and are instantiated using the kubectl command line tool. An example of an NFS PV is shown below. Note that the file essentially maps the name nfs-pv to an NFS mount on the server 192.168.1.40 at the path /opt/k8s-pods/data. The claimRef reserves the PV for a "claim" that we'll make in the next step.

 

Sample PV Definition
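
A minimal sketch of a PV manifest along these lines might look as follows. The name, server, path, and use of claimRef come from the description above; the capacity, access mode, reclaim policy, storage class, and namespace are assumptions added only to make the manifest complete.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi                        # assumed size
  accessModes:
    - ReadWriteMany                      # assumed; NFS can be shared by many pods
  persistentVolumeReclaimPolicy: Retain  # assumed; keep the data when the claim is released
  storageClassName: ""                   # assumed; empty string means no storage class
  claimRef:                              # reserves this PV for one specific claim
    namespace: default                   # assumed namespace
    name: nfs-pvc
  nfs:
    server: 192.168.1.40
    path: /opt/k8s-pods/data
```

Saved to a file (the name nfs-pv.yaml is hypothetical), a manifest like this would be created with `kubectl apply -f nfs-pv.yaml`.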

 

 

Defining Persistent Volume Claims

Persistent volume claims are also defined with YAML files. An example of a claim on the above PV is below. Note that the claimRef attribute from above reserves the PV for the PVC below. Also note that the volumeName attribute below explicitly claims the above PV. If these attributes are not included, a PVC will claim any PV that meets its specifications.

 

PVC Claiming a PV
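
As a rough sketch, a claim on the nfs-pv volume described above could be written like this. The names nfs-pvc and nfs-pv and the volumeName attribute come from the text; the namespace, access mode, storage class, and requested size are assumptions and would need to match whatever the PV actually declares.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
  namespace: default        # assumed; must match the namespace in the PV's claimRef
spec:
  accessModes:
    - ReadWriteMany         # assumed; must be offered by the PV
  storageClassName: ""      # assumed; empty string avoids dynamic provisioning
  volumeName: nfs-pv        # explicitly bind to the PV named nfs-pv
  resources:
    requests:
      storage: 10Gi         # assumed; must not exceed the PV's capacity
```

After the claim is created, `kubectl get pvc nfs-pvc` should report it as Bound if the PV and PVC specifications agree.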

 

Defining Volume Mounts

The final step mounts the PVC to a local path within the pod. Again, an example YAML file is below. Note that the PVC we created above, nfs-pvc, is mounted to the /usr/data/ path within the pod. 

 

volumeMount
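
A pod that mounts the nfs-pvc claim at /usr/data might be sketched as below. The claim name and mount path come from the text; the pod name, container name, image, command, and volume name are hypothetical placeholders, not anything Viya itself uses.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: data-reader              # hypothetical pod name
spec:
  containers:
    - name: app                  # hypothetical container name
      image: busybox:1.36        # placeholder image for illustration
      command: ["sleep", "3600"] # keep the container running so the mount can be inspected
      volumeMounts:
        - name: nfs-data         # must match a volume name listed under volumes
          mountPath: /usr/data   # path seen inside the container
  volumes:
    - name: nfs-data             # hypothetical volume name
      persistentVolumeClaim:
        claimName: nfs-pvc       # the PVC to mount
```

The volumeMounts entry and the volumes entry are tied together only by the shared name, so files written to /opt/k8s-pods/data on the NFS server would appear under /usr/data inside the container.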

 

There is, of course, much more to Kubernetes PVs and PVCs. For an in-depth understanding of Kubernetes persistent volumes, see this video.


Defining Volume Resources within a Pod

The other common alternative is to define an external storage resource as a volume directly within a pod and use a volumeMount to link the internal path to the external resource. While this may seem more convenient, the volume definition does not persist beyond the life of the pod, and it is not available to other pods through a claim; each pod has to repeat the storage details itself. For external storage such as NFS, the data itself remains on the server.

 


Direct Volume Mount
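
As an illustration of the direct approach, the sketch below defines an NFS volume inside the pod itself, with no PV or PVC involved. It assumes the same NFS server and path used earlier in this article; the pod name, container name, image, command, volume name, and mount path are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: direct-mount-demo        # hypothetical pod name
spec:
  containers:
    - name: app                  # hypothetical container name
      image: busybox:1.36        # placeholder image for illustration
      command: ["sleep", "3600"]
      volumeMounts:
        - name: nfs-direct       # must match the volume name below
          mountPath: /usr/data   # assumed internal path
  volumes:
    - name: nfs-direct           # hypothetical volume name
      nfs:                       # storage details live in the pod spec itself
        server: 192.168.1.40     # assumed; same server as the PV example
        path: /opt/k8s-pods/data # assumed; same export path as the PV example
```

Compared with the PV and PVC route, the server address and path are embedded in every pod that needs them, which is the trade-off described above.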

 

Making the Viya Connection

Once you understand how Kubernetes volumes work, you can explore your containerized Viya env with the kubectl command line tool (get pvc, get pv, describe pod, describe all and other commands) or the Kubernetes web UI to see where the internal paths of your caslibs and/or libnames point externally. Then you can tell the data source admin, "Drop the data in there."

For More

For more on how to explore and customize your SAS Viya Kubernetes environment, see Gerry's article here. For more on Kubernetes volumes, persistent volumes, persistent volume claims, pods, and more, see the API documentation.

