Multithreading, Parallelism, Python, and SAS


Both multithreading and parallel processing (parallelism) are hot topics in the data analysis space. But what is the difference between the two concepts? Are there any differences at all? When people compare SAS and Python’s parallel or multithreaded computing ability, what do they mean? Hopefully this post can contextualize these phrases and how they can be used to optimize both SAS and Python programs.

Multithreading vs Parallel Processing

When you run a simple program, like a program to count from 1 to 1 billion, you are running a program on a single thread without any parallel processing. A program of that nature might look something like this in Python:

def count(firstnum, lastnum):
    # Count upward one step at a time until lastnum has been passed.
    num = firstnum
    while num <= lastnum:
        num += 1

count(1, 1000000000)
print("Done!")

Let’s imagine we split this code into two equally-sized chunks:

Code that counts from 1 to 500 million

Code that counts from 500,000,001 to 1 billion

We’re effectively splitting the program into two processes, and if we ran both processes at the same time, we would expect it to take about half as long to complete all the counting. Executing these two processes in parallel should accomplish this, but executing them on two different threads could only serve to slow down our counting program. Why is this? Multithreading and parallel processing are not the same thing, despite their apparent similarity in “dividing and conquering” a given workload. Both can speed up certain tasks, but through separate means.

Multithreading, or the process of running one program across multiple threads, is all about concurrency. Concurrency refers to the act of splitting a problem into smaller parts that can be worked on individually. In “doing” multithreading with this counting program, we’d start by splitting its execution into two halves (and placing these two halves on two different threads). The threads would execute concurrently, but this is not the same thing as executing simultaneously. If we output each thread’s “count” number every time it’s incremented, we might get a messy output that looks something like this:

1

500000001

2

3

500000002

4

500000003

500000004

500000005

5

6

…

The program switches between our two threads very quickly and tries to reduce the idle time of our CPU by running whichever threads are “ready”. In this case each thread has no measurable idle time, given the negligible input/output of a counting program, so both are equally “ready” to resume counting. This all means that our counting program isn’t any faster when multithreaded, and in fact the act of splitting it into two threads could introduce a small amount of extra work that raises the execution time. Here are three outputs with the program on one thread, two threads, and four threads respectively:


The time varied between 28 and 36 seconds, with a similar mean time of about 32 seconds shown in each program’s output. As we can see, splitting our program into threads did not reduce the overall amount of time needed to complete the whole program.
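For readers who want to try this themselves, here is a minimal sketch of that kind of experiment using Python's standard threading module. It is not the exact program behind the timings above: the helper name run_in_threads and the much smaller TOTAL are assumptions made so the demo finishes quickly. On a standard CPython build, where only one thread runs Python code at a time, the three timings should come out roughly equal.

```python
import threading
import time


def count(firstnum, lastnum):
    """Count from firstnum up to lastnum, one step at a time."""
    num = firstnum
    while num <= lastnum:
        num += 1
    return num - firstnum  # number of increments performed


def run_in_threads(total, n_threads):
    """Split 1..total into n_threads equal ranges and time the whole run.

    Assumes total is evenly divisible by n_threads.
    """
    chunk = total // n_threads
    threads = [
        threading.Thread(target=count, args=(i * chunk + 1, (i + 1) * chunk))
        for i in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish
    return time.perf_counter() - start


if __name__ == "__main__":
    TOTAL = 2_000_000  # far smaller than 1 billion so the demo is quick
    for n in (1, 2, 4):
        print(f"{n} thread(s): {run_in_threads(TOTAL, n):.2f} s")
```

Exact timings depend on the machine, so no expected output is shown here.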

However, multithreading can speed up a program if you’re dealing with latency. For example, when you surf the web, a lot of things happen at once to load a website: user input is checked, objects on the webpage have to load, code is run for personalization, ads are fetched, etc. In a multithreaded approach we can do each of these “things” whenever it’s ready and do other things when it isn’t. Another great use case for multithreading would be manipulating data in a table. There’s waiting time between the input and output of items in a data table, and with a multithreaded approach the CPU can essentially switch over to whichever thread is ready to execute on the table at a given time.
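The latency case can be sketched in a few lines of Python. Everything here is illustrative: fetch and load_page are hypothetical names, and time.sleep stands in for real network or disk waits. Because the threads spend their time waiting rather than computing, the total elapsed time should be close to the longest single wait instead of the sum of all waits.

```python
import threading
import time


def fetch(name, latency, results):
    """Pretend to wait on a slow resource, then record that it is ready."""
    time.sleep(latency)  # stands in for network or disk wait time
    results[name] = "loaded"


def load_page(tasks):
    """Start one thread per task and wait for all of them to finish."""
    results = {}
    threads = [
        threading.Thread(target=fetch, args=(name, latency, results))
        for name, latency in tasks.items()
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical page components and their simulated wait times in seconds.
    tasks = {"images": 0.3, "ads": 0.2, "personalization": 0.1}
    results, elapsed = load_page(tasks)
    print(results)
    print(f"elapsed: {elapsed:.2f} s (the waits add up to {sum(tasks.values()):.1f} s)")
```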

Parallel processing differs from multithreading in that our initial example of two “count” processes can actually run simultaneously. This is accomplished by running on two separate cores as opposed to the two threads on one core that we’d get with multithreading. The thing that most often confuses people is that technically you are still running on two different threads here, but the operative word is “core”. Splitting the processes into two different cores allows for truly simultaneous execution of each process, potentially halving the execution time in this particular case.
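As a rough illustration of the two-process version, the sketch below uses Python's standard multiprocessing module to run the two halves of the count in separate processes. The helper split_range and the value of TOTAL are assumptions for the demo, not part of the original program. Whether the run time actually halves depends on having at least two free cores and on the count being large enough to outweigh the cost of starting the processes.

```python
import multiprocessing as mp
import time


def count(firstnum, lastnum):
    """Count from firstnum up to lastnum and return how many steps were taken."""
    num = firstnum
    while num <= lastnum:
        num += 1
    return num - firstnum


def split_range(first, last, parts):
    """Return `parts` contiguous (start, end) pairs covering first..last."""
    size, extra = divmod(last - first + 1, parts)
    bounds, start = [], first
    for i in range(parts):
        end = start + size - 1 + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end + 1
    return bounds


if __name__ == "__main__":
    TOTAL = 4_000_000  # kept small for the demo

    start = time.perf_counter()
    count(1, TOTAL)
    print(f"one process:   {time.perf_counter() - start:.2f} s")

    start = time.perf_counter()
    with mp.Pool(processes=2) as pool:
        # Each (start, end) pair is handed to its own worker process.
        counts = pool.starmap(count, split_range(1, TOTAL, 2))
    print(f"two processes: {time.perf_counter() - start:.2f} s, counted {sum(counts)}")
```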

Multithreading and Parallelism in SAS

SAS 9 has had multithreading and parallel processing capabilities for quite some time. The SAS Help Center lists the SAS Foundation procedures that currently take advantage of multithreading, and a paper from the 2010 SAS Global Forum talks about both concepts (both are in the Related Links below). The paper specifically details running multiple parallel batch SAS sessions to work with many chunks of a data table, and joining the output of those processes after each is completed. While I was playing at recess in elementary school (2010), SAS employees were hard at work getting I/O parallelism to improve DATA step computation speeds.

SAS/CONNECT is the next logical step from SAS 9 parallelism: it’s a set of tools that links a SAS client session to a SAS server session for you. This can allow for easier parallel processing as well as other features that are outside the scope of this post. For more information pertaining to SAS/CONNECT, please refer to the documentation.

In a previous post, I used CAS (SAS Viya’s Cloud Analytic Services) to compute a large volume of prime numbers. The two main types of CAS deployment are SMP (symmetric multi-processing) and MPP (massively parallel processing). The difference is the number of machines connected to your client: SMP has one machine and MPP has multiple machines. The program I made on MPP CAS works by sending data to 3 CAS workers with 8 threads each, for a total of 24 threads. This approach uses parallelism in that the three CAS workers are separate machines that can generate prime numbers simultaneously. It also uses multithreading to minimize latency in the input and output data tables by running 8 threads on each worker’s CPU cores. MPP CAS automatically deals with the parallel execution bit as long as I split my program into the desired number of threads (in this case, 24). In this sense, both parallelism and multithreading are baked into Viya and, by extension, SAS.
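To make the 3 workers × 8 threads arithmetic concrete, here is a small Python sketch of how a workload could be divided into 24 slices, one per (worker, thread) pair. This is only an illustration of the splitting idea; it is not CAS code and does not reflect how CAS actually distributes data. The function name assign_slices and the item count are hypothetical.

```python
def assign_slices(n_items, n_workers, threads_per_worker):
    """Map each (worker, thread) pair to a contiguous range of item indices."""
    n_slots = n_workers * threads_per_worker
    size, extra = divmod(n_items, n_slots)
    plan, start = {}, 0
    for slot in range(n_slots):
        # The first `extra` slots take one additional item each.
        stop = start + size + (1 if slot < extra else 0)
        worker, thread = divmod(slot, threads_per_worker)
        plan[(worker, thread)] = range(start, stop)
        start = stop
    return plan


if __name__ == "__main__":
    plan = assign_slices(n_items=1_000_000, n_workers=3, threads_per_worker=8)
    print(len(plan))     # 24 slices, one per (worker, thread) pair
    print(plan[(0, 0)])  # slice for the first thread on the first worker
    print(plan[(2, 7)])  # slice for the last thread on the last worker
```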

Can Python Use These Techniques?

In short, yes. Python can run on multiple threads or cores by using two modules: “threading” allows for multithreading, and “multiprocessing” allows for parallel processing. Both ship with Python’s standard library, so they can be imported without installing anything extra. The multiprocessing module has been around since Python 2.6 and threading for even longer; the documentation in the Related Links below is for Python 3.12. A breakdown of Python threading and multiprocessing can be found in this YouTube video.

Using a simple Python program, I was able to get multiple threads working. If you know of more complex examples of multithreading in Python, be sure to share the links in the comments of this post.

Python has something called the GIL, or “Global Interpreter Lock”, which in the standard CPython interpreter allows only one thread at a time to execute Python code within a single Python process. This poses little trouble for the latency-bound kind of multithreading described above, since threads that are waiting on input or output are not competing to run Python code, but it is also why splitting the counting program across threads did not make it faster. The multiprocessing module avoids the GIL by splitting work into completely separate “subprocesses” instead of threads, each with its own interpreter. These individual processes must be handled by the Python programmer in their code. The examples of this I’ve seen look easy enough, but manipulation of a data table in parallel stumped me as someone unfamiliar with Python parallelism. Nevertheless, if you knew what you were doing you could feasibly create powerful parallel Python programs using this module.
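One common pattern for working on a table in parallel is to split the rows into chunks, transform each chunk in its own process, and stitch the results back together. The sketch below does this with ProcessPoolExecutor from the standard concurrent.futures module. The table layout, the column names, and the helpers chunked, add_total, and process_table are all hypothetical, and this is just one possible approach under those assumptions.

```python
from concurrent.futures import ProcessPoolExecutor


def chunked(rows, n_chunks):
    """Split a list of rows into n_chunks nearly equal, contiguous pieces."""
    size, extra = divmod(len(rows), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        stop = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:stop])
        start = stop
    return chunks


def add_total(chunk):
    """Hypothetical per-row transformation: add a price * qty column."""
    return [{**row, "total": row["price"] * row["qty"]} for row in chunk]


def process_table(rows, n_workers=2):
    """Transform each chunk in its own process, then recombine in order."""
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        # executor.map yields results in the same order as the input chunks.
        parts = executor.map(add_total, chunked(rows, n_workers))
        return [row for part in parts for row in part]


if __name__ == "__main__":
    table = [{"id": i, "price": 2.5, "qty": i} for i in range(6)]
    for row in process_table(table):
        print(row)
```

Each chunk is copied to and from its worker process, so for a table this small the overhead outweighs any gain; the pattern only pays off when the per-row work is substantial.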

SAS Can Also Help Python Parallelize

Be on the lookout for a post by David Estreich and me, coming soon. It’ll explore the CAS Gateway Action Set, which can allow parallel execution of Python processes in a way that might be more familiar to SAS users.

Related Links:

The Moth - Threading/Concurrency vs Parallelism (danielmoth.com)

SAS Help Center: Using Parallel Processing

SAS Help Center: What Is Threading Technology in SAS?

SAS Help Center: Threading in Base SAS

Multithreading and concurrency fundamentals (educative.io)

217-29: Threads Unraveled: A Parallel Processing Primer (sas.com)

Seriously Serial or Perfectly Parallel Data Transfer with SAS Viya - Rob Collum

SAS Help Center: SAS Cloud Analytic Services: Fundamentals

Using MPP CAS Multi-Threading to boost prime number computing speed when using the brute force metho...

109-2010: Faster Results by Multi-threading DATA Steps (sas.com)

SAS Help Center: What is SAS/CONNECT?

threading — Thread-based parallelism — Python 3.12.5 documentation

multiprocessing — Process-based parallelism — Python 3.12.5 documentation

Parallel Processing in Python - GeeksforGeeks

https://www.youtube.com/watch?v=AZnGRKFUU0c

Find more articles from SAS Global Enablement and Learning here.

ronan · ‎10-23-2024

@RyanKing Thanks a lot for sharing this topic, too often overlooked, which is crucial in the real life. This post is very clear and engaging. Eagerly waiting for the next article !

RyanKing · ‎10-23-2024

@ronan Much appreciated! The wait shouldn't be too long, looking to get it out before the end of the year.

SAS Communities Library