Comparing Prime Computations Using the New CAS Gateway Action Set


The CAS Gateway Action Set is a set of CAS actions that lets users submit code written in other languages, such as Python or R, to the CAS server. It is similar to the “PROC Python” or “PROC SQL” SAS procedures in that non-SAS code is submitted, a language is specified, and the server processes the submitted code. Unlike those procedures, however, code submitted through the CAS Gateway Action Set can in principle use the parallel processing ability of CAS to run faster, which makes parallelization much easier to implement than with a procedure-based invocation of a language like Python. We became interested in this new feature and decided to test its efficiency using the example of prime number computation: a simple task for both SAS and Python that gets progressively more expensive as the number of processed entries increases.

Foundational Material and Project Code

This project is built in part on components of this prime number computation CAS post.

CAS Gateway Prime Tests Code | Github Repository

For those who view the code, there’s an additional test, outside the scope of this post, that tackles multithreaded execution in a Python procedure. To better understand CAS and parallelization vs. multithreading, check out this post.

Test Background

Six main tests were performed, each with the goal of picking prime numbers out of a number line from 1 to X:

Test Type	Input & Output Tables	CAS	Compute Server	SAS/CONNECT
SAS/BASE Code	No	No	Yes	No
SAS/BASE Code	Yes	No	Yes	No
Proc Python	No	No	Yes	No
Proc Python	Yes	No	Yes	No
CAS Gateway	No	Yes	No	No
CAS Gateway	Yes	Yes	No	No

Two related tests from other posts:

Test Type	Input & Output Tables	CAS	Compute Server	SAS/CONNECT
CAS Code (Post)	Yes	Yes	No	No
SAS/BASE Code (Post)	Yes	No	Yes	Yes

Tests were performed on our own virtualized SAS Viya 4 environment. Results may vary in other environments with different topologies, but the relative gains or losses observed in these tests should give a good idea of the benefits of CAS Gateway. The CAS session we’re running on is configured with 3 CAS workers. All non-CAS tests run on a SAS compute server (this includes the Python procedures). Each computing node uses 64 GB of RAM and an Intel Xeon Gold CPU with a speed of 3.10 GHz and 8 cores, with 1 thread per core.

The tests with input and output tables are probably more in line with real-world use cases of SAS. In those tests, a table of numbers is created, we iterate through the table and update it to reflect whether each number is “prime” or “not prime”, and then we output the resulting table. “Computation only” tests iterate through a loop of numbers, decide whether each is prime, and record that to a SAS table that doesn’t get output. These computation tests are worth considering because, in theory, they take a substantial share of I/O (writing to and reading from tables) out of the equation.
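To make the two test shapes concrete, here is a minimal Python sketch. It is an illustration only, not the project code: the trial-division check, the function names (`is_prime`, `computation_only`, `with_tables`), and the list-of-dictionaries stand-in for a table are all assumptions made for this example. The algorithms actually used in the tests are in the GitHub repository.

```python
import math


def is_prime(n):
    """Brute-force trial division: try odd divisors up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def computation_only(limit):
    """Loop over 1..limit with no input table; keep flags locally, return a count."""
    flags = [is_prime(n) for n in range(1, limit + 1)]
    return sum(flags)  # the flags themselves are never written out


def with_tables(limit):
    """Create an input 'table', update every row, and return the output table."""
    table = [{"number": n} for n in range(1, limit + 1)]  # hypothetical table layout
    for row in table:
        row["status"] = "prime" if is_prime(row["number"]) else "not prime"
    return table


prime_count = computation_only(100)
output_table = with_tables(10)
```

Both functions do the same primality work; the second also pays for building, updating, and returning every row, which is the I/O-like overhead the table-based tests are meant to expose.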

It’s important to note that both the SAS code and the Python procedure run on the standard SAS Programming Run-Time Environment, whereas CAS Gateway submissions run on CAS nodes, which allow for multiprocessing.

In our findings output table, each test reports the type of program that was run, the number of values processed, and the duration (dur) the program ran for, in seconds. Individual rows in the findings table are non-concurrent: the same Python test for 10,000 and 1,000,000 searched numbers, for example, would not run at the same time.
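A findings table of this shape could be collected with a small timing harness like the sketch below. The helper name `run_timed`, the dictionary keys, and the placeholder workload `count_to` are hypothetical and chosen for illustration; the real tests time SAS, PROC Python, and CAS Gateway programs rather than this stand-in function.

```python
import time


def run_timed(test_type, limit, func):
    """Run func(limit) once and return one findings row with its duration in seconds."""
    start = time.perf_counter()
    func(limit)
    dur = time.perf_counter() - start
    return {"test_type": test_type, "numbers": limit, "dur": dur}


def count_to(limit):
    """Placeholder workload standing in for a prime search."""
    total = 0
    for n in range(1, limit + 1):
        total += n
    return total


# Rows are produced one after another, so no two runs overlap in time.
findings = [run_timed("placeholder", limit, count_to) for limit in (10_000, 100_000)]
for row in findings:
    print(f'{row["test_type"]:<12} {row["numbers"]:>10,} {row["dur"]:.4f}s')
```

Running the sizes sequentially keeps each row's duration free of contention from the other rows, which matches the non-concurrent setup described above.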

The Findings

SAS code (computation only):

Our computation-only SAS code provides a baseline for the larger CAS Gateway efficiency test. As mentioned before, the time that prime number discovery takes to execute increases dramatically as the number of observations rises.

SAS code (with input and output tables):

Negligible differences are observed between calculating primes in SAS with or without tables. Over multiple tests, the average duration for each row comes out nearly the same. In this run the second test appears slightly faster for 10,000,000 observations, whereas in other iterations of the same test the opposite may be true.

PROC Python submitted Python code (computation only):

In this SAS-optimized environment we see an increase in program run time for the PROC Python code. These tests don’t compare SAS code and Python code directly, as the algorithms used are slightly different and the SAS environment will be better optimized for SAS code. Still, keep these numbers in mind when we make comparisons to the parallel processing of CAS in the next outputs. It’ll be useful to compare our Python procedure with a CAS Gateway execution of Python.

PROC Python submitted Python code (with input and output tables):

The dramatic rise in time in these tests, compared to the computation-only Python code, is likely due to input and output for the data tables. Python has a number of different methods for processing data transformations in tables. The one we chose might not be the most efficient option for Python, but it was easy to implement and closest to our SAS code in structure.

CAS Gateway submitted Python code (computation only):

Because CAS Gateway computation was so quick we were able to add another row and bump observations by an extra order of magnitude: up from ten million to one hundred million.

CAS Gateway submitted Python code (with input and output tables):

Again, there are huge gains between the Python procedure durations and CAS Gateway durations. CAS Gateway clocks in at over 10 times faster in later cases thanks to parallel processing and multithreading inherent to CAS, which the CAS Gateway Action Set allows our earlier Python code to take advantage of.

Conclusions

Our CAS Gateway code runs the same Python algorithm for calculating primes as the Proc Python code. The CAS version is understandably faster, as it can leverage CAS resources for parallel processing and multithreading, but is it worth the time it takes to reformat the non-CAS code into CAS code? In this case, absolutely! It’s surprisingly easy to take a Python procedure and convert it into CAS-eligible code for the CAS Gateway Action Set. Provided your threads running on CAS don’t need to talk to each other, as is the case in these tests, the core of your code doesn’t need to change at all. For this example, a few minutes of adding surrounding CAS Gateway code equates to gains of 10x or more on this environment. Results may vary across different operations, datasets, and environments, but the point is that you can get massive returns for minimal effort.
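The reason the core code can stay unchanged is that each thread only needs to know which slice of the number line belongs to it. The plain-Python sketch below illustrates that idea and does not use the CAS Gateway API: `worker_id`, `n_workers`, `worker_slice`, and `count_primes_for_worker` are hypothetical names, the worker count of 3 is purely illustrative, and the workers are simulated one after another rather than run in parallel.

```python
import math


def is_prime(n):
    """Same brute-force check a single-threaded program would use."""
    if n < 2:
        return False
    if n < 4:
        return True
    if n % 2 == 0:
        return False
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def worker_slice(worker_id, n_workers, limit):
    """Interleaved slice: worker_id+1, worker_id+1+n_workers, ... up to limit."""
    return range(worker_id + 1, limit + 1, n_workers)


def count_primes_for_worker(worker_id, n_workers, limit):
    """Each worker runs the unchanged prime check on its own slice only."""
    return sum(1 for n in worker_slice(worker_id, n_workers, limit) if is_prime(n))


limit = 1000
n_workers = 3  # illustrative value, not a statement about how CAS assigns threads

# Simulated sequentially here; on a parallel backend each call would run on its own thread.
partial_counts = [count_primes_for_worker(w, n_workers, limit) for w in range(n_workers)]
total_primes = sum(partial_counts)
print(total_primes)
```

Because no slice depends on another, the only code added around `is_prime` is the slicing and the final sum, which is the kind of small wrapper the conclusion refers to.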

Related Links:

Using MPP CAS Multi-Threading to boost prime number computing speed when using the brute force metho...

Base SAS + SAS/CONNECT – A simple method to generate load on any number of licensed cores

CAS Gateway Prime Tests Code | Github Repository

Multithreading, Parallelism, Python, and SAS

Find more articles from SAS Global Enablement and Learning here.

SAS Communities Library