
Enabling GPUs on a SAS Viya Container


 

You might have already heard about the deep learning capabilities within SAS Visual Data Mining and Machine Learning and SAS Analytics for IoT.  And although a graphical processing unit (GPU) is not required on your computer to use these deep learning features, a GPU provides additional benefits, most notably much faster processing of heavy analytics workloads.

 

I wanted to test and see with my own eyes what added value GPUs bring, and I wanted to try things out on a GPU-enabled SAS Viya Programming-only container on the Google Cloud Platform (GCP).  In this article, I’ll cover setting up a cloud virtual machine (VM), as well as deploying the SAS Viya container and the other required components. The result will give you an idea of how much GPUs speed up model training.

 

Assumptions

  • You are familiar with Docker and container technologies
  • You have a valid SAS License to make this work

Cloud environment setup

I started by reserving an n1-standard-32 (32 vCPUs, 120 GB memory) GCP virtual machine instance and added two GPUs to it.  For demonstration purposes and cost reasons, I decided on a preemptible VM.  A preemptible VM is an instance that you can create and run at a lower price than a normal instance; the disadvantage is that Compute Engine might terminate (preempt) it if it needs those resources for other tasks.  For testing purposes, this is perfectly fine.
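
For reference, here is a minimal sketch of how such an instance can be created with the gcloud CLI. The instance name, zone, GPU type, image, and disk size are my own assumptions (the article does not prescribe them), so adjust them to your project and quota:

# Hypothetical example -- name, zone, GPU type, image, and disk size are assumptions
gcloud compute instances create sas-viya-gpu \
    --machine-type=n1-standard-32 \
    --accelerator=type=nvidia-tesla-t4,count=2 \
    --maintenance-policy=TERMINATE \
    --preemptible \
    --zone=us-central1-a \
    --image-family=centos-7 \
    --image-project=centos-cloud \
    --boot-disk-size=200GB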

 

Component setup

Install the following components on the GPU VM instance: the NVIDIA GPU driver, Docker (version 19.03 or later, for the --gpus flag), and the NVIDIA Container Toolkit, which lets containers access the GPUs. A minimal install sketch is shown below.
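
This is a minimal sketch for a CentOS/RHEL 7 image, using the standard NVIDIA and Docker repositories for that platform; treat the repository URLs and package names as assumptions and adapt them to your OS:

# Minimal sketch for CentOS/RHEL 7 -- adapt repositories and package names to your OS
sudo yum install -y yum-utils epel-release

# NVIDIA GPU driver from the CUDA repository
sudo yum-config-manager --add-repo \
    https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
sudo yum install -y nvidia-driver-latest-dkms

# Docker CE (19.03 or later is needed for the --gpus flag)
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y docker-ce docker-ce-cli containerd.io

# NVIDIA Container Toolkit, so containers can access the GPUs
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo \
    | sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install -y nvidia-container-toolkit
sudo systemctl enable --now docker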

Confirm the prerequisites by running the following command:

 

 

docker run --gpus all nvidia/cuda:9.0-base nvidia-smi

 

 

You should see the available GPUs for a running container:

 

[Screenshot: nvidia-smi output showing the available GPUs]

 

Build container image

When all prerequisites are met, you’re ready to build the GPU-enabled SAS Viya programming-only analytics container (SAS Viya POAC) image.  I started from a Dockerfile that was developed by my colleague Michael Gorkow.  I slightly modified the Dockerfile, and it’s available here on GitHub.  The resulting Docker container image contains SAS Viya and the other essential components.  I can’t discuss the entire Dockerfile in this short blog, but here are a few tips (a sample build command follows after them):

  • It’s important to start from this base container image:

 

FROM nvidia/cuda:10.1-devel-centos7

 

  • Make sure to add a RUN line to install the required dependency packages:
RUN yum -y update && yum install -y epel-release \
    && yum install -y gcc wget git python-devel java-1.8.0-openjdk glibc libpng12 libXp libXmu numactl \
       xterm initscripts which iproute sudo httpd mod_ssl \
    && yum install -y openssl unzip openssh-clients bind-utils openssl-devel deltarpm libffi-devel net-tools \
    && yum -y groupinstall "Development Tools" \
    && yum clean all

 

  • Ansible is required; add the following line to your Dockerfile:
RUN wget https://bootstrap.pypa.io/get-pip.py && python get-pip.py && pip install ansible==2.7.12
  • As recommended here, I created a SAS Viya mirror repository beforehand. Once the repository is downloaded, run an nginx container so that the mirror is served by a web server:
docker run --restart=always --name nginx_mirror5 -v /mirrorlocation/:/usr/share/nginx/html:ro -p 9125:80 -d nginx_with_dirlist
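
Before the image build points to the mirror through --repository-warehouse, a quick check that the web server actually serves the mirror content doesn’t hurt (port 9125 as in the run command above):

# Quick sanity check that the mirror is reachable on the published port
curl -s http://localhost:9125/ | head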

 

  • Install SAS Viya. That could be something like:
# Add deployment data zip to directory
RUN mkdir -p /opt/sas/installfiles
WORKDIR /opt/sas/installfiles
ADD SAS_Viya_deployment_data.zip /opt/sas/installfiles
# Get orchestration tool and install.  Then build and untar playbook
ADD sas-orchestration /opt/sas/installfiles
RUN /opt/sas/installfiles/sas-orchestration build --platform redhat --deployment-type programming --input SAS_Viya_deployment_data.zip --repository-warehouse http://x.x.x.x:9125/  && tar xvf SAS_Viya_playbook.tgz
WORKDIR /opt/sas/installfiles/sas_viya_playbook
RUN mv /opt/sas/installfiles/sas_viya_playbook/inventory.ini /opt/sas/installfiles/sas_viya_playbook/inventory.ini.orig
RUN cp /opt/sas/installfiles/sas_viya_playbook/samples/inventory_local.ini /opt/sas/installfiles/sas_viya_playbook/inventory.ini
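# Work around container-build limitations: drop the httpd restart handler and skip the
# pre-install validation, then run the playbook and point the generated CAS/SAS
# configuration files at localhost instead of the build-time host name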
RUN sed -i "/ notify/,+9d" roles/httpd-x64_redhat_linux_6-yum/tasks/configure-and-start.yml && \
sed -i "s/- include: validate/#/" internal/deploy-preinstall.yml && \
	ansible-playbook site.yml -vvv && \
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/cas/default/cas.hosts && \ 
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/cas/default/casconfig_deployment.lua && \
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/cas/default/cas.yml && \
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/cas/default/cas.hosts.tmp && \
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/batchserver/default/autoexec_deployment.sas && \
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/sysconfig/cas/default/sas-cas-deployment && \
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/sysconfig/cas/default/cas_grid_vars && \
	sed -i "s/$(hostname)/localhost/g" /opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas && \
	sed -i "s/$(hostname)/localhost/g" /etc/httpd/conf.d/proxy.conf

 

  • Inside the Dockerfile, we also have to add Anaconda and JupyterLab, together with the SAS python-swat and python-dlpy packages.  These libraries let you test SAS Viya code from sample Jupyter notebooks.
RUN yum install -y wget bzip2 ca-certificates \
    libglib2.0-0 libxext6 libsm6 libxrender1 \
    git mercurial subversion && yum clean all
RUN echo 'export PATH=/opt/conda/bin:$PATH' > /etc/profile.d/conda.sh && \
    wget --quiet https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh -O ~/anaconda.sh && \
    /bin/bash ~/anaconda.sh -b -p /opt/conda && \
    rm ~/anaconda.sh
ENV PATH /opt/conda/bin:$PATH

 

  • Install Python packages with Anaconda:

RUN conda install -c anaconda seaborn -y \
	&& conda install -c conda-forge sas_kernel -y \
	&& conda install -c anaconda pillow -y \
	&& conda install -c conda-forge matplotlib -y \
	&& conda install -c anaconda graphviz -y \
	&& conda install -c conda-forge python-graphviz -y 

 

RUN pip install https://github.com/sassoftware/python-swat/releases/download/v1.6.1/python-swat-1.6.1-linux64.tar.gz && pip install opencv-python && pip install sas-dlpy

 

  • Configure Jupyter
RUN pip install jupyterlab && \
	jupyter notebook --generate-config --allow-root && \
	echo "c.NotebookApp.token = u''" >> /root/.jupyter/jupyter_notebook_config.py && \
	echo "c.NotebookApp.ip = '*'" >> /root/.jupyter/jupyter_notebook_config.py && \
	echo "c.NotebookApp.notebook_dir = '/data'" >> /root/.jupyter/jupyter_notebook_config.py

 

Run Docker container

When you have successfully built the Docker container image, it’s time to spin up a running container. Don’t forget to enable the GPUs.

 

 

docker run -d --gpus 2 -P -v /data/yolo:/data sas_viya_gpu_03022020:v3_5
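
At this point, a couple of quick checks are useful, for example to confirm that the GPUs are visible inside the container and to find the host ports that -P published. The container ID is a placeholder, and the JupyterLab port assumes the image exposes the default 8888:

docker ps                              # note the container ID and the published ports
docker exec <container_id> nvidia-smi  # the two GPUs should be listed inside the container
docker port <container_id>             # find the host port mapped to JupyterLab (default 8888)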

 

I tested the container with the following use case: manufacturing quality inspection is an inefficient use of resources, prone to human error, and therefore costly.  Manufacturers are increasingly looking for ways to automate the process so that it can keep up with increasing demands in terms of both quality and quantity.

 

My colleagues Xiasi Liu, Xin Ru Lee and Leon Jordan Francisco demonstrated how the SAS platform can be leveraged to operationalize (train, deploy and consume results from) computer vision models. They developed a defect detection model in which the predicted bounding boxes are processed to provide insights into the location and severity of a defect.

I received some sample code from the team on how they trained a YOLO model.  By slightly modifying the code (adding the gpu parameter to the fit call), I was able to activate the available GPUs:

 

yolo_model.fit(
          data='tr_img', 
          optimizer=optimizer, 
          data_specs=data_specs, 
          n_threads=8, 
          record_seed = 13309,
          force_equal_padding=True,
          gpu=1)

The results of the model training are highlighted below. With the GPUs, training completed in roughly 23 seconds instead of 490 seconds, a speed-up of more than 20x:

# Run without GPUs
NOTE:  Epoch Learning Rate        Loss        IOU   Time(s)
NOTE:  39       0.0001           1.109     0.7409    12.04
NOTE:  The optimization reached the maximum number of epochs.
NOTE:  The total time is     490.15 (s).

# Run with GPUs activated
NOTE:  Epoch Learning Rate        Loss        IOU   Time(s)
NOTE:  39       0.0001           1.389     0.7744     0.53
NOTE:  The optimization reached the maximum number of epochs.
NOTE:  The total time is      22.87 (s).

 

 

If you want, you can test some Python notebook examples running on the GPU-accelerated Viya container; model2 is especially interesting, as it makes use of the GPU.

 

Credit goes to my colleagues, who provided me with the components needed to make this test possible.

