Python parallelism cheat sheet
I often get asked “how can I parallelise my Python code?”. I’ve come up with this simple cheat sheet to explain it. I will only cover the most common class of parallel problem here: embarrassingly parallel problems.
This blog post is the first in a series I am writing, covering methods of simple parallelism. The following posts cover more convenient methods, as well as some things that should be considered.
Parallelism methods
If I’ve skipped your favourite method of parallelism, feel free to tweet me or add a comment on the tracking issue informing me.
I’ve created an example notebook which can be used as a base.
Throughout, we shall be referring to this code example. It’s basic, but illustrates the procedure:
avalues = range(20)
bvalues = range(100, 120)
const = 100
results = []
for a, b in zip(avalues, bvalues):
    # pretend this computation is much more expensive
    results.append((a + b) * const)
print(results)
Step 1
Find the part to parallelise.
In most of my applications, this is typically a single `for` loop which performs a lot of work on each iteration. In the code example, it is the loop over `a` and `b`:
for a, b in zip(avalues, bvalues):
    results.append((a + b) * const)
Step 2
Find out what data changes for each iteration of the loop
Typically the loop iterates over something, and it may be more than one value. In the example, it’s `a` and `b`.
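If it helps to see this concretely, the varying data can be materialised as a list of pairs. This is just a quick sanity check of my own, not part of the final code:

avalues = range(20)
bvalues = range(100, 120)

# each tuple is the data that changes on one loop iteration
varying = list(zip(avalues, bvalues))
print(varying[:3])  # [(0, 100), (1, 101), (2, 102)]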
Step 3
Find out what data does not change for each iteration of the loop
This data should ideally be read-only. In the example it’s `const`.
Step 4
Convert your loop body into a function whose first argument contains the things that vary, with the fixed data passed as the remaining parameters
Given the way Python’s multiprocessing works, it’s generally easiest to use `multiprocessing.Pool.map`, but this takes a function of a single argument. We’ll sort that out in the next step. It’s extremely important to make sure your code works as before after this conversion; this is the source of most errors.
Applying this process to the example:
def worker_fn(changing_stuff, const_value):
    '''A function that takes as its first parameter
    the values that change on each loop iteration;
    the remaining parameters do not change between
    loop iterations.
    '''
    a, b = changing_stuff
    return (a + b) * const_value
avalues = range(20)
bvalues = range(100, 120)
const = 100
results = []
for a, b in zip(avalues, bvalues):
    # call the new function
    result = worker_fn((a, b), const)
    results.append(result)
print(results)
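It’s worth checking at this point that the refactored code gives the same answers as the original loop. A minimal check of my own (not part of the original example), assuming the variables defined above:

# the original computation, kept temporarily for comparison
expected = [(a + b) * const for a, b in zip(avalues, bvalues)]
assert results == expected  # the refactored version should match exactly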
Step 5
Create a partially applied function, using functools.partial
This is how we get around the single-argument problem: we “bake in” the constant arguments to produce a new function which takes a single argument:
from functools import partial
def worker_fn(changing_stuff, const_value):
    '''A function that takes as its first parameter
    the values that change on each loop iteration;
    the remaining parameters do not change between
    loop iterations.
    '''
    a, b = changing_stuff
    return (a + b) * const_value
# ...
const = 100
# this function now takes only a single argument
fn = partial(worker_fn, const_value=const)
# ...
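To illustrate what partial gives us, the new function produces the same result as calling worker_fn directly with the constant. A quick check of my own, assuming the definitions above:

# fn only needs the varying data; const_value is already baked in
assert fn((0, 100)) == worker_fn((0, 100), const)  # both evaluate to 10000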
Step 6
Now replace the for loop with the common pool setup:
from functools import partial
from multiprocessing import Pool # 1
def worker_fn(changing_stuff, const_value):
    '''A function that takes as its first parameter
    the values that change on each loop iteration;
    the remaining parameters do not change between
    loop iterations.
    '''
    a, b = changing_stuff
    return (a + b) * const_value
avalues = range(20)
bvalues = range(100, 120)
const = 100
fn = partial(worker_fn, const_value=const)
# this must be an `iterable`, so a list or generator
zipped_args = zip(avalues, bvalues) # 2
# Python 3
with Pool() as pool: # 3
    results = pool.map(fn, zipped_args) # 4
# or if you're stuck with Python 2:
pool = Pool()
results = pool.map(fn, zipped_args) # 4
1. Import the `Pool` object, which defaults to one process per CPU.
2. Create an iterable of the arguments which vary per loop iteration. This can be either a list or a generator; look up the `zip` documentation for more information.
3. Using the `Pool` as a context manager cleans up the processes after use.
4. Call the `map` function, which applies a function taking one argument to each item of an iterable.
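If `map` is unfamiliar, it may help to see the serial equivalent. Roughly speaking (this is a mental model, not a description of the implementation), `pool.map` computes the same thing as the list comprehension below, but spreads the calls across processes:

# roughly what pool.map(fn, zipped_args) computes, in a single process
serial_results = [fn(args) for args in zip(avalues, bvalues)]
print(serial_results)  # same values as the parallel version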
With these changes, your code should be parallelised! Of course things may not be that easy. Here are some common gotchas:
- The biggest problem is with pickling. This is the process of serialising Python objects into a byte stream, so they can be sent to the other processes running your code. Lots of things can be pickled in Python, but some important ones that can’t are:
  - class instances, so you cannot use an instance method as your worker function
  - open files or sockets
  - if you find yourself with tracebacks that contain “pickle” then this is your problem
- Exceptions in processes cause weird behaviour. Newer versions of Python are better at this than older ones (*cough* Python 2 *cough*), but generally you cannot catch exceptions thrown from other processes. Make your `worker_fn` bulletproof (see the sketch after this list).
- A consequence of the previous point is that trying to Ctrl-C out of a parallelised piece of code does not work, because the processes receive the `KeyboardInterrupt` error, but they do not synchronise and exit sanely. This may be fixed in a later Python 3 release, but I don’t know for sure.
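One way to make the worker function bulletproof is to catch exceptions inside it and return them as values, so the parent process can inspect failures afterwards. This is a sketch of my own (assuming the imports and variables from the full example above), not code from the multiprocessing documentation:

def safe_worker_fn(changing_stuff, const_value):
    '''Like worker_fn, but returns exceptions as values
    instead of letting them propagate out of the child process.
    '''
    try:
        a, b = changing_stuff
        return (a + b) * const_value
    except Exception as exc:
        # hand the error back as data; the parent can filter these out
        return exc

safe_fn = partial(safe_worker_fn, const_value=const)
with Pool() as pool:
    results = pool.map(safe_fn, zip(avalues, bvalues))

# separate successful results from failures
errors = [r for r in results if isinstance(r, Exception)]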
Many thanks to James McCormac for helpful suggestions.