December 11, 2023

Create a historic Ubuntu package mirror with apt-mirror and zfs

If you're running Ubuntu or Debian Docker containers you're most likely installing packages through `apt`:

```
RUN apt update && apt install -y wget curl openssl git
```

One thing that surprises many people is that this installs the latest available versions of these packages, because the Ubuntu package registry is constantly updated. Which version ends up in your container therefore depends on when the container was built and on what was in your build cache. This means that containers built from the same Dockerfile (e.g. locally and in production) might have completely different versions of dependencies. Your dependency list can also change any time the container gets rebuilt, potentially breaking your builds.

Your first instinct might be to pin specific package versions, e.g.:

```
RUN apt update && apt install -y curl=7.68.0-1ubuntu2.20
```

While this looks like it works at first glance, it will break your build pretty soon, because the registry actively deletes older versions. Packages might also just disappear. For example, Chromium used to be available through `apt` but was removed in favor of the snap version (if this broke your build, see Install Chromium in an Ubuntu container).

An immutable Ubuntu package registry

What you'd like is an immutable Ubuntu package registry. One that's frozen in time. So when you install a dependency you'll always get the exact same package and version back. That doesn't mean that you'll never update your dependencies (it's great to get security patches), but then you can do this when you're ready to update (and run proper tests to see that nothing breaks). Here's how we could set that up:

  1. Mirror the current Ubuntu package registry to a server via apt-mirror.
  2. Create an immutable snapshot of the registry using zfs.
  3. Serve the snapshot through a web server.
  4. Repeat daily.

Now you can add the snapshot as your package registry, which will be immutable. To set this all up, see:

https://github.com/stablebuild/historic-ubuntu-package-registry
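
For orientation, here is a minimal sketch of what steps 1, 2 and 4 could look like as a daily job. It is not taken from the repository above. The dataset name `tank/apt-mirror` and the `DRY_RUN` switch are assumptions made for illustration, and it presumes `apt-mirror` is already configured (by default through `/etc/apt/mirror.list`) to download into that dataset.

```shell
#!/bin/sh
# Sketch of a daily mirror-and-snapshot job (assumptions, not the repo's script).
set -eu

DATASET="${DATASET:-tank/apt-mirror}"  # hypothetical ZFS dataset holding the mirror
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print the commands, 0 = run them

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the snapshot name for a given date (YYYY-MM-DD).
snapshot_name() {
    printf '%s@%s\n' "$DATASET" "$1"
}

# Step 1: sync the mirror. Step 2: freeze its current state as a read-only snapshot.
daily_snapshot() {
    run apt-mirror
    run zfs snapshot "$(snapshot_name "$1")"
}

daily_snapshot "$(date +%Y-%m-%d)"
```

Scheduling this once a day (for example from cron, with `DRY_RUN=0`) covers step 4. For step 3, ZFS exposes each snapshot read-only under `.zfs/snapshot/<name>/` inside the dataset's mount point, so a web server could be pointed at that directory.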

Afterwards you'll have a daily snapshot of the complete registry and you'll have a stable package list:

Snapshot of the Ubuntu package registry showing curl
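
To use a snapshot in a build, `apt` has to be pointed at it instead of the default archive. The sketch below prints `sources.list` entries for one snapshot date. The host name `mirror.example.internal`, the `<base>/<date>/ubuntu` URL layout, and the `focal` release are assumptions for illustration; adjust them to however your mirror is actually served.

```shell
# Print sources.list entries for one snapshot of the mirror.
# MIRROR_BASE and the URL layout (<base>/<date>/ubuntu) are hypothetical.
MIRROR_BASE="${MIRROR_BASE:-http://mirror.example.internal}"

snapshot_sources() {
    day="$1"      # snapshot date, e.g. 2023-12-11
    release="$2"  # Ubuntu release codename, e.g. focal
    for suite in "$release" "$release-updates" "$release-security"; do
        printf 'deb %s/%s/ubuntu %s main restricted universe multiverse\n' \
            "$MIRROR_BASE" "$day" "$suite"
    done
}

snapshot_sources 2023-12-11 focal
```

Redirecting that output to `/etc/apt/sources.list` before the `apt update` step makes every rebuild resolve packages against the same frozen snapshot. Moving to newer packages then becomes a deliberate change of the date.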

OK, great - but now I have another server...

Running your own registry works really well, but comes with downsides:

  1. It's extra infrastructure to manage. Someone needs to maintain the server and troubleshoot if something goes wrong. That's extra load on your infra team.
  2. Once you integrate the mirror in your build system it becomes a critical part of your build infra. If the mirror server is down you won't be able to build new containers.
  3. It's expensive - at least when hosting through the major cloud providers:

    * Throughput-optimized hard disks (not even SSDs) are $0.045 / GB / month at AWS. At 6 TB provisioned that's $270 / month, just for storage.
    * You most likely want multiple servers to not have a single point of failure.
    * On top of server cost you'll also be charged for traffic. Egress prices at the major cloud providers are high. For example, pricing from CloudFront to the internet at AWS is between $0.085 and $0.17 / GB depending on the region. You'll also need to pay for intra-region traffic (to go from the mirror server to CloudFront, min. $0.02 / GB), an additional fee per request, and a load balancer.
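
To get a feel for how these numbers add up, here is a rough back-of-the-envelope calculation using the prices quoted above. The 1,000 GB of monthly egress is a made-up input, not a measurement. The estimate uses the cheapest CloudFront rate and leaves out per-request fees, the load balancer, and any redundant servers, so treat it as a lower bound.

```shell
# Rough monthly cost estimate from the prices quoted above (USD).
# Prints: storage cost, egress cost, total.
monthly_cost() {
    storage_gb="$1"  # provisioned disk, in GB
    egress_gb="$2"   # traffic served per month, in GB (hypothetical input)
    awk -v s="$storage_gb" -v e="$egress_gb" 'BEGIN {
        storage = s * 0.045            # throughput-optimized HDD, per GB-month
        egress  = e * (0.085 + 0.02)   # cheapest CloudFront rate + intra-region transfer
        printf "%.2f %.2f %.2f\n", storage, egress, storage + egress
    }'
}

monthly_cost 6000 1000   # prints: 270.00 105.00 375.00
```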

Can't we do better?

Introducing StableBuild

Yes! To properly fix this problem we've built StableBuild. At StableBuild we mirror the Ubuntu and Debian package registries, plus the most popular PPAs, daily, so you get the benefits of using an immutable registry without having to manage servers yourself. On top of that you'll get access to other tools to make your builds stable and deterministic, like our Docker and PyPI mirrors.

Want to try it out? Get started for free at https://dashboard.stablebuild.com.