Python parallelism cheat sheet
I often get asked “how can I parallelise my Python code?”. I’ve come up with this simple cheat sheet to explain it. I will only cover the most common class of parallel problem here: embarrassingly parallel problems.
This blog post is the first in a series I am writing, covering methods of simple parallelism. The following posts cover more convenient methods, as well as some things that should be considered.
Parallelism methods
If I’ve skipped your favourite method of parallelism, feel free to tweet me or add a comment on the tracking issue informing me.
I’ve created an example notebook which can be used as a base.
Throughout, we shall be referring to this code example. It’s basic, but illustrates the procedure:
avalues = range(20)
bvalues = range(100, 120)
const = 100
results = []
for a, b in zip(avalues, bvalues):
    # pretend this computation is much more expensive
    results.append((a + b) * const)
print(results)
Step 1
Find the part to parallelise.
In most of my applications, this is typically a single `for` loop which performs a lot of work on each iteration. In the code example, it is the loop over `a` and `b`:
for a, b in zip(avalues, bvalues):
    results.append((a + b) * const)
Step 2
Find out what data changes for each iteration of the loop
Typically the loop iterates over something, and it may be more than one value. In the example, it’s `a` and `b`.
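If it helps to see this concretely, the varying data can be materialised as a list of pairs. This is just a quick sanity check of my own, not part of the final code:

avalues = range(20)
bvalues = range(100, 120)

# each tuple is the data that changes on one loop iteration
varying = list(zip(avalues, bvalues))
print(varying[:3])  # [(0, 100), (1, 101), (2, 102)]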
Step 3
Find out what data does not change for each iteration of the loop
This data should ideally be read-only. In the example it’s `const`.
Step 4
Convert your loop body into a function whose first argument contains the things that vary, with the fixed data passed as the remaining parameters
Given the way Python’s multiprocessing works, it’s generally easiest to use `multiprocessing.Pool.map`, but this takes a function of a single argument. We’ll sort that out in the next step. It’s extremely important to make sure your code works as before after this conversion; this is the source of most errors.
Applying this process to the example:
def worker_fn(changing_stuff, const_value):
    '''A function that takes as its first parameter
    the values that change on each loop iteration;
    the remaining parameters do not change between
    loop iterations.
    '''
    a, b = changing_stuff
    return (a + b) * const_value
avalues = range(20)
bvalues = range(100, 120)
const = 100
results = []
for a, b in zip(avalues, bvalues):
    # call the new function
    result = worker_fn((a, b), const)
    results.append(result)
print(results)
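It’s worth checking at this point that the refactored code gives the same answers as the original loop. A minimal check of my own (not part of the original example), assuming the variables defined above:

# the original computation, kept temporarily for comparison
expected = [(a + b) * const for a, b in zip(avalues, bvalues)]
assert results == expected  # the refactored version should match exactly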
Step 5
Create a partially applied function, using functools.partial
This is how we get around the single-argument problem: we “bake in” the constant arguments to produce a new function which takes a single argument:
from functools import partial
def worker_fn(changing_stuff, const_value):
    '''A function that takes as its first parameter
    the values that change on each loop iteration;
    the remaining parameters do not change between
    loop iterations.
    '''
    a, b = changing_stuff
    return (a + b) * const_value
# ...
const = 100
# this function now takes only a single argument
fn = partial(worker_fn, const_value=const)
# ...
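To illustrate what partial gives us, the new function produces the same result as calling worker_fn directly with the constant. A quick check of my own, assuming the definitions above:

# fn only needs the varying data; const_value is already baked in
assert fn((0, 100)) == worker_fn((0, 100), const)  # both evaluate to 10000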
Step 6
Now replace the for loop with the common pool setup:
from functools import partial
from multiprocessing import Pool # 1
def worker_fn(changing_stuff, const_value):
    '''A function that takes as its first parameter
    the values that change on each loop iteration;
    the remaining parameters do not change between
    loop iterations.
    '''
    a, b = changing_stuff
    return (a + b) * const_value
avalues = range(20)
bvalues = range(100, 120)
const = 100
fn = partial(worker_fn, const_value=const)
# this must be an `iterable`, so a list or generator
zipped_args = zip(avalues, bvalues) # 2
# Python 3
with Pool() as pool: # 3
    results = pool.map(fn, zipped_args) # 4
# or if you're stuck with Python 2:
pool = Pool()
results = pool.map(fn, zipped_args) # 4
1. Import the `Pool` object, which defaults to one process per CPU.
2. Create an iterable of the arguments which vary per loop iteration. This can be either a list or a generator; look up the `zip` documentation for more information.
3. Using the `Pool` as a context manager cleans up the processes after use.
4. Call the `map` function, which applies a function taking one argument to each item of an iterable.
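If `map` is unfamiliar, it may help to see the serial equivalent. Roughly speaking (this is a mental model, not a description of the implementation), `pool.map` computes the same thing as the list comprehension below, but spreads the calls across processes:

# roughly what pool.map(fn, zipped_args) computes, in a single process
serial_results = [fn(args) for args in zip(avalues, bvalues)]
print(serial_results)  # same values as the parallel version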
With these changes, your code should be parallelised! Of course things may not be that easy. Here are some common gotchas:
- The biggest problem is with pickling. This is the process of serialising Python objects into a byte stream, so they can be sent to the other processes running your code. Lots of things can be pickled in Python, but some important ones that can’t are:
  - class instances, so you cannot use an instance method as your worker function
  - open files or sockets
  - if you find yourself with tracebacks that contain “pickle” then this is your problem
- Exceptions in processes cause weird behaviour. Newer versions of Python are better at this than older ones (*cough* Python 2 *cough*), but generally you cannot catch exceptions thrown from other processes. Make your `worker_fn` bulletproof (see the sketch after this list).
- A consequence of the previous point is that trying to Ctrl-C out of a parallelised piece of code does not work, because the processes receive the `KeyboardInterrupt` error, but they do not synchronise and exit sanely. This may be fixed in a later Python 3 release, but I don’t know for sure.
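One way to make the worker function bulletproof is to catch exceptions inside it and return them as values, so the parent process can inspect failures afterwards. This is a sketch of my own (assuming the imports and variables from the full example above), not code from the multiprocessing documentation:

def safe_worker_fn(changing_stuff, const_value):
    '''Like worker_fn, but returns exceptions as values
    instead of letting them propagate out of the child process.
    '''
    try:
        a, b = changing_stuff
        return (a + b) * const_value
    except Exception as exc:
        # hand the error back as data; the parent can filter these out
        return exc

safe_fn = partial(safe_worker_fn, const_value=const)
with Pool() as pool:
    results = pool.map(safe_fn, zip(avalues, bvalues))

# separate successful results from failures
errors = [r for r in results if isinstance(r, Exception)]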
Many thanks to James McCormac for helpful suggestions.