Packaging Python Apps

by Morgan Shorter on Mon, 25 Apr 2022

I have never quite understood why Python (or Ruby) packages are delivered through their own manager (pip, gem) instead of the local system package manager (apt, yum, etc.). It might make sense for pure-language packages, but it becomes borderline insane when dealing with bindings to native libraries.

Side-stepping the OS distribution creates tons of problems, especially when you rely on a tool with design issues as problematic as easy_install's. If you are interested in the history of software packaging, here is a great post about the subject.

Python bindings, which package manager to choose?

Fortunately, 10 years later, these problems have been recognized and worked on in the Python community. Pip now supports pre-built binaries (wheels), though most packages are built for the manylinux target, which links against glibc. That has important implications when, for various reasons (including the fact that DjaoDjin is a PaaS that runs Web applications as Docker containers), we package Python applications as Docker images.

The official Python Docker images come in multiple variants:

  • python:<version> Debian-based with common packages
  • python:<version>-slim Debian-based with the minimal packages needed to run Python
  • python:<version>-alpine Alpine-based, for when small image size is a primary concern
  • python:<version>-windowsservercore Windows-based, because well...

First caveat: glibc vs. musl

Alpine is built against musl, not glibc as most distributions, Debian included, are. Musl imposes several constraints, such as a smaller default thread stack size, which some users have reported issues with.
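To check which C library the interpreter in a given image is linked against, the standard library's platform.libc_ver() is a reasonable first probe. The sketch below assumes that an empty name means "not recognized as glibc", which is what this call typically reports on Alpine; newer interpreters may report musl differently. The describe_libc helper is a hypothetical name used only for illustration.

```python
import platform


def describe_libc(libc_name: str, libc_version: str) -> str:
    """Turn a (name, version) pair, as returned by platform.libc_ver(), into a label."""
    if libc_name:
        # glibc is reported as ("glibc", "<version>") on Debian-based images.
        return f"{libc_name} {libc_version}".strip()
    # Assumption: an empty name means the C library was not recognized,
    # which is the usual outcome on musl-based images such as Alpine.
    return "unknown (possibly musl)"


if __name__ == "__main__":
    name, version = platform.libc_ver()
    print(describe_libc(name, version))
```

Running this with the interpreter inside each candidate base image is a quick way to confirm which family of pre-built wheels that image can use.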

There is a new PEP (PEP 656) for wheels to be built against musl libc and distributed under the musllinux tag. This tag is currently not as widely adopted as manylinux. Most crucially, the cffi package does not have musllinux builds yet (Apr 2022). This means that installing compiled-extension packages for a musl-linked Python, even ones that ship a musllinux build, still requires a compilation step whenever cffi is in the dependency tree, which makes it somewhat impractical to use Python binaries built against musl.
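The platform tag is the last dash-separated field of a wheel's file name, so it is easy to tell which C library a given wheel targets. Below is a minimal sketch that classifies a file name as glibc (manylinux), musl (musllinux), or pure; the function name and the example_pkg file names are hypothetical, and the code only inspects the name, not the wheel contents.

```python
def wheel_platform_family(filename: str) -> str:
    """Classify a wheel by the platform tag at the end of its file name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    # The platform tag is the last dash-separated field; it may hold
    # several tags joined by dots (a compressed tag set).
    platform_field = filename[: -len(".whl")].rsplit("-", 1)[-1]
    families = set()
    for tag in platform_field.split("."):
        if tag == "any":
            families.add("pure")
        elif tag.startswith("manylinux"):
            families.add("glibc")
        elif tag.startswith("musllinux"):
            families.add("musl")
        else:
            families.add("other")
    return "+".join(sorted(families))


if __name__ == "__main__":
    for name in (
        "example_pkg-1.0-py3-none-any.whl",
        "example_pkg-1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "example_pkg-1.0-cp310-cp310-musllinux_1_1_x86_64.whl",
    ):
        print(name, "->", wheel_platform_family(name))
```

Listing the files a project publishes and running them through a check like this shows quickly whether an Alpine-based image would have to compile that dependency from source.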

On the other hand, unlike Debian's apt, Alpine's package manager, apk, supports temporary / virtual packages. This makes it easier to build software from source without leaving gcc and other build-time prerequisites behind in Docker images. The feature doesn't spare us the time and resources needed to compile, but it does keep the resulting Docker images free from the security and disk-usage impact of build-time dependencies.

Practically, any application that requires Python bindings to native code, because, for example, it generates charge receipts (WeasyPrint), processes images (ImageMagick), uses the PostgreSQL bindings (psycopg2), or uses cryptography features (pyca/cryptography), is left with only python:<version> and python:<version>-slim as alternatives at this point.

Second caveat: /usr/local/bin/python

In the official Python Docker images, Python is built from source and installed in /usr/local. This means a few important things:

  1. Installing Python packages via the OS package manager might install a second copy of Python and its dependencies, which may be incompatible with the Python version that was built from source for the official image. Using the OS package manager will also bloat the image with a huge graph of (likely irrelevant) dependencies. It is less deterministic and more likely to result in run-time errors.
  2. It is possible (likely, even) to have a version mismatch between the Python built from source and the Python installed by the OS.
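When in doubt about which interpreter a container actually runs, asking Python itself is the most direct check. The sketch below reports the executable, the installation prefix, the version, and the directory pip installs into; the helper names are hypothetical. It assumes the usual layout, where the official image's interpreter lives under /usr/local while Debian's own python3 packages live under /usr and install modules into a separate dist-packages directory.

```python
import sys
import sysconfig


def interpreter_report() -> dict:
    """Collect where the running interpreter lives and where packages go."""
    return {
        "executable": sys.executable,
        # base_prefix stays at the real installation even inside a virtualenv.
        "base_prefix": sys.base_prefix,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "site_packages": sysconfig.get_paths()["purelib"],
    }


def is_local_build(base_prefix: str) -> bool:
    """True when the interpreter was installed under /usr/local."""
    return base_prefix.rstrip("/") == "/usr/local"


if __name__ == "__main__":
    report = interpreter_report()
    for key, value in report.items():
        print(f"{key}: {value}")
    print("built from source under /usr/local:", is_local_build(report["base_prefix"]))
```

Running it with both python and /usr/bin/python3 (when the latter exists) makes any version or path mismatch visible before it shows up as a run-time import error.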

Writing a packaging policy

Given the previous analysis, we standardize our Dockerfiles as follows:

  • use an official `python:3.X-slim` base image
  • prefer pip over the OS package manager for Python bindings
  • install pure native code libraries through the OS package manager
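A policy like this is easier to keep when it is checked automatically. Below is a rough sketch of such a check, not a tool we ship: the function name and regular expressions are assumptions made for illustration, it only understands apt / apt-get install lines, and it flags any OS package whose name starts with python- or python3-.

```python
import re

# Hypothetical patterns for this sketch; adjust them to your own conventions.
BASE_IMAGE = re.compile(r"^FROM\s+python:3\.\d+(\.\d+)?-slim(-[a-z]+)?(\s|$)", re.IGNORECASE)
OS_INSTALL = re.compile(r"\bapt(-get)?\s+(-\S+\s+)*install\b")
OS_PYTHON_PKG = re.compile(r"\bpython3?-[a-z0-9][a-z0-9.+-]*")


def check_dockerfile(text: str) -> list:
    """Return a list of policy problems found in the Dockerfile text."""
    problems = []
    # Join backslash-continued lines so each instruction is on one line.
    lines = [line.strip() for line in text.replace("\\\n", " ").splitlines()]
    from_lines = [line for line in lines if line.upper().startswith("FROM ")]
    if not from_lines or not BASE_IMAGE.match(from_lines[-1]):
        problems.append("final stage is not based on python:3.X-slim")
    for line in lines:
        if OS_INSTALL.search(line):
            for match in OS_PYTHON_PKG.finditer(line):
                problems.append(f"Python package installed through the OS: {match.group(0)}")
    return problems


if __name__ == "__main__":
    sample = (
        "FROM python:3.10-slim\n"
        "RUN apt-get update && apt-get install -y libpq5\n"
        "RUN pip install -r requirements.txt\n"
    )
    print(check_dockerfile(sample) or "no policy problems found")
```

Native libraries installed through apt (libpq5 in the sample) pass the check, while something like python3-psycopg2 would be reported, which mirrors the split between pip and the OS package manager described above.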

