In the world of AI, complex algorithms combined with big data require computing power to meet the challenge. GPUs help provide that power. In SAS Viya Stable release 2021.1.4 (August 2021), we began supporting the use of GPUs for training and scoring reinforcement learning models.
A GPU's cores are smaller and more numerous than a CPU's, and together they can perform millions of mathematical operations in parallel.
As video game popularity increased in the 1980s and 90s, computer microprocessors improved to accommodate more and more advanced graphics demanded by gamers.
Graphics manipulation requires that millions of points be handled simultaneously. A GPU can have hundreds of cores to enact massively parallel processing on a single task, such as manipulating an image. NVIDIA (now a SAS partner) was one of the first companies to build GPUs that handle the massively parallel operations needed for smooth video display. The GPU divides up an image and processes each section simultaneously. This provides the mammoth processing power needed to render the many frames per second required for seamless 3D motion in video games.
But GPUs aren’t just for video games! Deep learning algorithms used in machine learning and artificial intelligence are also notorious for devouring computer processing time. These algorithms require tremendous numbers of matrix multiplication operations per second. See the graphic below showing petaflops/day needed to train various deep learning models.
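To make the "tremendous numbers of matrix multiplication operations" concrete, here is a minimal sketch in NumPy (not SAS code; the layer sizes are hypothetical) showing that a single dense layer's forward pass is essentially one matrix multiplication:

```python
import numpy as np

# Illustrative sketch: one dense neural-network layer is one matrix multiply.
# Sizes below are hypothetical, chosen only to make the arithmetic visible.
batch, n_in, n_out = 256, 1024, 1024
x = np.random.rand(batch, n_in).astype(np.float32)   # input activations
w = np.random.rand(n_in, n_out).astype(np.float32)   # layer weights

y = x @ w  # the operation GPUs accelerate so well

# Roughly 2 * batch * n_in * n_out floating-point operations for this one
# layer, one forward pass: about 537 million FLOPs.
flops = 2 * batch * n_in * n_out
```

Multiply that by many layers, many passes over the data, and many training epochs, and the petaflop/s-day figures in the graphic become plausible.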
By 2009, data scientists had discovered the advantages of GPUs for running power-hungry deep learning algorithms. Andrew Ng of Stanford University demonstrated that 64 GPUs could adequately replace 16,000 CPU cores in the Google X project.
By 2020, additional advances had given GPUs an even greater lead over CPUs. Particularly for repetitive operations on large data, GPUs perform dramatically better than CPUs. Data scientists also rewrote algorithms to take better advantage of parallel processing. GPUs now allow for near-real-time machine learning and AI.
GPUs are fast because they are highly efficient at matrix multiplication and convolution, the calculations commonly and repeatedly required for computer graphics, and at SIMD (single instruction, multiple data) operations generally.
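The SIMD idea can be seen in miniature with NumPy's vectorized operations (an illustration only, not SAS code): one operation is applied to many data elements at once, which is what GPU hardware does across thousands of cores.

```python
import numpy as np

# SIMD in miniature: a single operation applied to many data elements.
pixels = np.array([10, 20, 30, 40], dtype=np.float32)

brightened = pixels * 1.5  # one "instruction", four data elements at once
# -> [15. 30. 45. 60.]

# A tiny 1D convolution, the other workhorse operation mentioned above.
kernel = np.array([0.25, 0.5, 0.25], dtype=np.float32)
smoothed = np.convolve(pixels, kernel, mode="same")
```

A GPU applies the same pattern, but across millions of pixels or matrix entries simultaneously rather than four numbers.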
GPU components are tiny. A single component may measure as little as 40 nanometers across, less than one-thousandth the diameter of a single hair off of my head. NVIDIA's GPUs can come with 3,584 cores, while Intel's top-end server CPUs may have a maximum of 28 cores. Plus a GPU can handle thousands of threads simultaneously.
One of the reasons that GPUs perform so well is memory bandwidth. GPUs are bandwidth optimized whereas CPUs are latency optimized. GPUs are good at fetching large amounts of memory but CPUs can fetch small amounts of memory more quickly. GPUs offset their latency “deficiency” by running operations on many parallel threads at the same time.
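A back-of-the-envelope model makes the bandwidth-versus-latency trade-off concrete. The numbers below are assumptions chosen for intuition, not measured values for any real CPU or GPU:

```python
# Simple fetch-time model (hypothetical numbers, for intuition only):
#   total time = latency + bytes / bandwidth
def fetch_time_us(n_bytes, latency_us, bandwidth_gb_s):
    # 1 GB/s = 1e3 bytes per microsecond
    return latency_us + n_bytes / (bandwidth_gb_s * 1e3)

big = 1_000_000_000   # a 1 GB transfer
small = 64            # a single cache line

# Assumed figures: CPU = low latency, modest bandwidth;
# GPU = higher latency, much higher bandwidth.
cpu_big, gpu_big = fetch_time_us(big, 0.1, 50), fetch_time_us(big, 0.6, 750)
cpu_small, gpu_small = fetch_time_us(small, 0.1, 50), fetch_time_us(small, 0.6, 750)
```

Under these assumptions the GPU wins decisively on the large transfer (bandwidth dominates), while the CPU wins on the tiny fetch (latency dominates), which is exactly the trade-off described above.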
Here at SAS, Frederik Vandenberghe tested the speed of fitting a YOLO model and found a significant time savings (from about 12 seconds to about half a second). For details see Enabling GPUs on a SAS VIYA Container by Frederik Vandenberghe.
SAS supports the use of GPUs for reinforcement learning and recurrent neural networks.
SAS began supporting the use of GPUs for training and scoring reinforcement learning models in SAS Viya stable release 2021.1.4 (August 2021). Recall that reinforcement learning is a machine learning technique in which an agent maximizes a long-term reward accumulated over a sequence of actions. The reward signal is often a scalar function that indicates the goodness or badness of the agent's decisions. The long-term reward is maximized through an iterative trial-and-error process. For more about reinforcement learning with SAS, see my previous article, Reinforcement Learning with SAS VDMML.
The SAS reinforcement learning action set supports GPUs to speed up model training and scoring. By default, no GPU is used. To use a GPU, specify the gpu parameter and set its enable subparameter to True. If your CAS server is running in MPP mode, only the GPU on the worker 0 node is used.
By default, training and scoring with a GPU are nondeterministic. You can force the action to behave deterministically by specifying deterministic=True, at the cost of slower performance.
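As a hedged sketch of the parameter shape in Python, the settings from the two paragraphs above look roughly like the dictionary below. Only the gpu/enable and deterministic parameters come from the description above; the action name in the comment is a placeholder, not a real SAS identifier:

```python
# Parameter shape only; the action name below is a placeholder.
params = {
    "gpu": {"enable": True},   # turn on GPU training/scoring (off by default)
    "deterministic": True,     # force reproducible results, at a speed cost
}

# From Python's SWAT client, a call would look roughly like
#   conn.<rl_training_action>(..., **params)
# against a running CAS server (in MPP mode, only worker node 0's GPU is used).
```

Leave deterministic out (or set it to False) when speed matters more than run-to-run reproducibility.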
In the current VDMML stable release 2021.1.5, SAS reinforcement learning uses CUDA 10.2. NVIDIA is a SAS partner, and both NVIDIA V100 and P100 GPUs are supported.
SAS also supports using GPUs for recurrent neural networks to reduce processing time. The underlying algorithm differs slightly on GPUs from the one used on CPUs. There are a number of requirements; see the documentation for the full list.
As of the current release (2021.1.5), the SAS Deep Learning toolkit does not support portability between the CPU and the GPU for GRU networks. For more details see the documentation.
You may want to train your RNN on a GPU and then score new data on a CPU (or vice versa). If so, see the specific guidelines in the documentation.
MPP mode. In MPP mode, all GPU devices on the same compute node must be homogeneous, i.e., the same model. Across different compute nodes they do not have to match; in fact, some compute nodes can be CPU-only and others GPU-only. Synchronous gradient descent is the supported optimization method.
Energy use. GPUs require more energy than CPUs, with power consumption commonly over 300 watts. GPUs consume so much energy because they have a large number of transistors switching at high frequency. In computing centers this can create issues with power supply and thermal dissipation.
Unsupported operations. Some SAS Deep Learning operations are not supported on GPUs as of stable release 2021.1.5; see the documentation for the list. If you train a SAS Deep Learning network on a GPU with any unsupported operations, you will engage a mixed CPU/GPU mode and those operations will run on a CPU.
Sequential versus parallel processes. GPUs excel at massively parallel processing, but not all algorithms can be fully parallelized. For some algorithms, CPUs will actually perform better. GPU architectures do best on processes with little to no branching or data dependencies.
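The contrast above can be sketched in a few lines of NumPy (illustrative only, not SAS code): an elementwise operation is embarrassingly parallel, while a recurrence, where each step needs the previous result, is inherently sequential:

```python
import numpy as np

x = np.arange(1, 9, dtype=np.float64)

# Parallel-friendly: every element is independent, so a GPU could
# compute all of them at once.
squares = x ** 2

# Sequential by nature: each step depends on the previous result, so the
# work cannot simply be split across thousands of cores.
running = np.empty_like(x)
acc = 0.0
for i, v in enumerate(x):
    acc = acc * 0.5 + v   # a simple recurrence with a data dependency
    running[i] = acc
```

Recurrent neural networks contain exactly this kind of step-to-step dependency, which is one reason the GPU algorithm for RNNs differs from the CPU one.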
SAS supports GPUs for a variety of deep learning capabilities and now supports them for reinforcement learning! Check the documentation for the GPUs supported by your software version, and review the list of requirements to make sure GPUs will work appropriately for your needs.