Modern Python development
Python dependency management is difficult.
…not because it’s complicated, but because the ecosystem has been around for so long and has evolved over time. Best practices have come and gone, and a great deal of confusion has developed from out-of-date blog posts and Stack Overflow advice.
I am here to hopefully simplify the process into the following steps:
- Install Python: asdf
- Managing global packages: pipx
- Install a Python dependency manager: poetry
- Best practices
- My workflow
Throughout this post, I will explain my reasoning behind my suggestions, so that hopefully you will understand why I have made my choices.
The Python ecosystem. Items in green represent suggested tools. Items in red represent parts of the system that should not be used.
Install Python
pyenv allows multiple versions of Python to be installed at the same time, and selected per directory. This means that if one project depends on Python 3.6 and another depends on Python 3.8, the correct version can be designated for each project.
asdf can be thought of as pyenv for many languages, so the two are interchangeable.
The main reason I suggest using one of these two version managers is that your install is then not tied to the system Python.
On macOS it’s tempting to use homebrew to install your Python version. This is not a good idea. homebrew regularly updates the python formula to include the latest version. If, for example, a CLI package such as the AWS CLI is installed, it’s tied to a single global Python minor version. When homebrew updates the Python formula, the AWS CLI will stop working.
Managing global packages
Suggestion: Use pipx
pipx is a tool which installs a single command-line application (for example the AWS CLI) into its own virtual environment. This isolates its packages from everything else. When combined with asdf, this means that Python version upgrades are up to you, and therefore will not break your packages at unpredictable times.
This isolation is especially important for Python dependency managers. They should be installed in their own environment so that they don’t pick up packages from outside by accident.
Install a Python dependency manager
Suggestion: Use poetry, installed via pipx
Python has a long and confusing history when it comes to packaging. Unlike nodejs, go or rust, Python cannot import multiple versions of the same package at the same time.
Python includes a variable, sys.modules, which is a dictionary mapping module names to modules. Even if a package could be installed at several different versions, Python would only ever hold one of them under that name.
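A minimal sketch of this single-entry behaviour, using the standard library's json module purely as an illustrative stand-in for any package:

```python
import sys

import json
import json as json_again  # a second import of the same module name

# sys.modules maps each module name to exactly one module object.
cached = sys.modules["json"]

# Both import statements hand back the very same cached object, so there is
# no slot in which a second version of "json" could live.
same_object = cached is json and json is json_again
print(same_object)
```

Because the cache is keyed by name only, two dependencies that need different versions of the same package cannot both be satisfied in one interpreter.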
Restricting environments to use only a single package version means that all versioning constraints must be met for a project’s dependencies to be installed.
This all therefore means that:
- projects must be isolated from each other, so that invalid versions of packages from one project are not installed into other projects; and
- package dependencies must be strictly resolved, so that projects can include the correct versions.
Isolation is implemented through “virtual environments” [1], which configure a Python interpreter to look in a specific location for packages. This means that a virtual environment for project A can include different packages than the virtual environment for project B.
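As a rough illustration, the interpreter itself can report whether it is running inside a virtual environment and where it will look for packages. The helper name below is hypothetical, and the check assumes an environment created by the standard venv module or a recent virtualenv:

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtual environment:", in_virtual_environment())

# Packages are looked up in these locations, in order.
for location in sys.path:
    print("  ", location)
```

Running this with the interpreter of project A and then with that of project B should print different package locations for each.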
Project isolation
Suggestion: Use poetry
If you are using pipenv, see the pipenv section.
Project dependencies should be managed separately from global packages, although they share the same requirement for isolation.
Project dependency management introduces an additional layer of complexity: version locking. For application-type projects (CLIs, web applications, anything that is not going to be imported as a package by others), package versions should be locked to ensure reproducible behaviour. Library-type projects, which are designed to be imported by others (for example requests), should not lock their package versions. The dependency specification of library-type projects should be as loose as possible, to reduce complications for projects that import the library.
Note: this does not mean that version constraints cannot be used, but rather that they should be as loose as possible. Typically this means using lower bounds (greater-than constraints) rather than upper bounds (less-than constraints).
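To see why loose constraints matter for libraries, here is a deliberately simplified sketch. The parse and satisfies helpers and the version numbers are made up for illustration; real resolvers handle far more (pre-releases, wildcards, combined constraints), so this is not how poetry or pip actually resolve versions:

```python
import operator
from typing import Tuple

# Only the three operators needed for this illustration are supported.
OPERATORS = {">=": operator.ge, "==": operator.eq, "<": operator.lt}


def parse(version: str) -> Tuple[int, ...]:
    """Turn "2.25.1" into (2, 25, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, constraint: str) -> bool:
    """Check one version against one constraint such as ">=2.20"."""
    for symbol, compare in OPERATORS.items():
        if constraint.startswith(symbol):
            return compare(parse(version), parse(constraint[len(symbol):]))
    raise ValueError(f"unsupported constraint: {constraint}")


# Two hypothetical applications have locked different versions of a package.
locked_by_apps = ["2.25.1", "2.31.0"]

# A library asking for ">=2.20" works with both applications...
loose = [satisfies(version, ">=2.20") for version in locked_by_apps]
# ...whereas a library pinning "==2.25.1" conflicts with the second one.
exact = [satisfies(version, "==2.25.1") for version in locked_by_apps]
print(loose)
print(exact)
```

The application keeps reproducibility through its own lockfile, so the library loses nothing by staying loose.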
Two major project dependency managers exist for Python: poetry and pipenv.
The behaviour of these two project dependency managers is approximately the same. Each takes a loose specification of the project’s dependencies, which is then locked into a separate file with exact version constraints. Projects can be set up to install dependencies from this lockfile only, ensuring the same dependency versions each time the project is installed.
Both tools create a virtual environment behind the scenes. This is effectively invisible to the user. In order to run a command from within that environment, it must be run as <pipenv or poetry> run <cmd>. Both tools include a way to configure the current shell to include those packages by default, for convenience: <pipenv or poetry> shell.
The reasons to prefer poetry over pipenv may include an element of subjectivity, but here I will try to remain objective.
The first advantage of poetry is the manifest format used to define packages: a pyproject.toml file. This replaces an imperative script that must be executed (setup.py) in order to determine a project’s dependencies with a static manifest that can be cached, or parsed without installing the package itself. This manifest file has been supported by pip since version 19.0, released in 2019, so legacy projects can still use a dependency that has converted to pyproject.toml. This is not the case for pipenv. Its package format is custom and exclusive to pipenv.
poetry includes a custom dependency resolver, which is faster than pipenv’s. This makes a difference when getting up and running with a new project, but also when adding new packages.
When working on a project, typically you want the local package to be installed in “editable” mode (i.e. a symlink), so that package changes are picked up when running tests. An alternative is to use tox; however, this is often slower than running tests against the local package installed in editable mode, as a source package must be generated and installed on each test run.
poetry does this by default. You do not need to specify the local package in the manifest file, and poetry knows what to do.
pipenv supports installing the local package in development mode, but in order to install the local package when “deploying” [2], the local package must be included in the [packages] section. In my experience, this does not work. I believe packages in the [packages] section overwrite packages in the [dev-packages] section, meaning that the local package does not get installed in editable mode, and changes are not reflected when running tests.
pipenv does not have any support for publishing packages. This means that if you use pipenv to manage a library project, you must use setup.py to define your library dependencies, rather than pyproject.toml. poetry supports building and publishing packages as part of its default feature set.
Best practices
- Set the PIP_REQUIRE_VIRTUALENV environment variable to true. This will prevent pip from installing into the global package list, without a virtual environment.
- Never install packages using pip into the global environment of a Python installation. pipx enforces the use of virtual environments for CLIs, and poetry enforces the use of virtual environments for projects.
- Never use sudo to install packages with pip. This practice usually indicates that you are doing something wrong.
- If using pipenv, read the output of the command when it runs, to make sure you are using an expected version of Python for the virtual environment. See the pipenv section for more details.
- Do not install packages with your system package manager (e.g. apt-get). These packages are usually installed only for the system Python, and you should not use the system Python.
- If you find yourself using pip install --user then something is wrong. If you are installing a CLI, then use pipx. If you are installing to a project, use the project’s dependency manager.
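A small sanity check for the first practice could look like the sketch below. The function name is hypothetical, and the set of spellings treated as "true" is an assumption about how pip interprets boolean settings, so treat it as a convenience rather than a guarantee:

```python
import os
from typing import Mapping

# Assumption: these are the spellings pip treats as "true" for boolean settings.
TRUTHY = {"1", "true", "t", "yes", "y", "on"}


def pip_requires_virtualenv(environ: Mapping[str, str] = os.environ) -> bool:
    """Report whether PIP_REQUIRE_VIRTUALENV is switched on in the given environment."""
    value = environ.get("PIP_REQUIRE_VIRTUALENV", "")
    return value.strip().lower() in TRUTHY


print("pip guarded against global installs:", pip_requires_virtualenv())
```

Running it in a fresh shell is a quick way to confirm that the variable was exported from your shell configuration and not just set for one session.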
My workflow
My workflow when setting up a new laptop is as follows:
Install asdf
- Clone the repository: git clone https://github.com/asdf-vm/asdf.git ~/.asdf --branch <branch e.g. v0.8.0>
- Add source ${HOME}/.asdf/asdf.sh to your shell
- Add python as a language: asdf plugin add python
- Install some python versions, one per command: asdf install python 3.8.9 and asdf install python 3.9.1
Install pipx
- brew install pipx
- Set up pipx to use a fixed version of Python: set the environment variable, e.g. PIPX_DEFAULT_PYTHON=$(asdf which python3.9)
Use pipx to install poetry
- pipx install poetry
Set up a Python project
- poetry init
- Add dependencies: poetry add ...
Using pipenv
Virtual environment version
If using pipenv, you must be careful when pipenv sets up the virtual environment for you. It is fairly greedy about which python it picks, so I usually create the virtual environment manually before running any other pipenv commands.
In the example below, I set the local python version to 3.6.13 via asdf. I then create a pipenv project, but it chooses to use python 3.9.5 to create the virtual environment rather than the 3.6.13 I specified.
$ asdf local python 3.6.13
$ python --version
Python 3.6.13
$ which python
/Users/<me>/.asdf/shims/python
$ asdf which python
/Users/<me>/.asdf/installs/python/3.6.13/bin/python
# This step uses python 3.9 to create the virtual environment
$ pipenv install
Creating a virtualenv for this project... HERE
Pipfile: /private/tmp/pipenvtesting/Pipfile ▼▼▼▼▼
Using /Users/<me>/.asdf/installs/python/3.9.5/bin/python3 (3.9.5) to create virtualenv...
⠼ Creating virtual environment...created virtual environment CPython3.9.5.final.0-64 in 245ms
creator CPython3Posix(dest=/Users/<me>/.local/share/virtualenvs/pipenvtesting-TqUem2-j, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/Users/<me>/Library/Application Support/virtualenv)
added seed packages: pip==21.0.1, setuptools==56.0.0, wheel==0.36.2
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
✔ Successfully created virtual environment!
Virtualenv location: /Users/<me>/.local/share/virtualenvs/pipenvtesting-TqUem2-j
Installing dependencies from Pipfile.lock (3cbc06)...
🐍 ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 0/0 — 00:00:00
To activate this project's virtualenv, run pipenv shell.
Alternatively, run a command inside the virtualenv with pipenv run.
To create the virtual environment manually, set the python version for the project with asdf (asdf local python 3.9.5), then create the pipenv virtual environment: pipenv --python $(asdf which python). In addition, it should be common practice to specify the python version in the Pipfile:
[requires]
python_version = "3.8"
If the version of python is not found with this configuration, pipenv will prompt you to install it, via either asdf or pyenv.
Another common problem with pipenv is locking packages with Python >= 3.8, and trying to install the project with Python < 3.8. This causes a confusing error:
ModuleNotFoundError: No module named 'importlib_metadata'
Python >= 3.8 includes importlib.metadata as part of the standard library, so packages only depend on the importlib-metadata backport for earlier versions. Therefore, locking the packages with a later version of Python does not include this package in the list of dependencies.
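The error usually originates from a version-conditional import like the one sketched here. This is a common pattern and not code taken from any particular package:

```python
import sys

if sys.version_info >= (3, 8):
    # Python 3.8+ ships the module in the standard library.
    from importlib import metadata
else:
    # Older interpreters need the importlib-metadata backport to be installed.
    # If the lockfile was generated on Python >= 3.8, the backport is missing
    # and this line raises ModuleNotFoundError.
    import importlib_metadata as metadata

needs_backport = sys.version_info < (3, 8)
print("backport required on this interpreter:", needs_backport)
```

Locking with the oldest Python version the project supports is one way to make sure such conditional dependencies end up in the lockfile.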
[1] Until the __pypackages__ concept is more widely supported.
[2] For example building a docker container, or lambda.