Debarshi's den

Toolbx is a release blocker for Fedora 39 onwards


This is the second instalment of my 2023 retrospective series on Toolbx. 1

One very important thing that we did behind the scenes was to make Toolbx a release blocker for Fedora 39 and onwards. This means that the registry.fedoraproject.org/fedora-toolbox OCI image is considered a release-blocking deliverable, and there are release-blocking test criteria to ensure that the toolbox RPM is usable.

Why do that?

Earlier, there was no formal requirement for Toolbx to be usable when a new Fedora was released. That was a problem for a tool that’s so popular and provides something as fundamental as an interactive command line environment for software development and troubleshooting the host operating system. Everybody expects their CLI environment to just work even under very adverse conditions, and Toolbx should be no different. Except that Toolbx is slightly more complicated than running Bash or Z shell directly on the host OS, and, therefore, requires a bit more diligence.

Toolbx has two parts — an OCI image, which defaults to registry.fedoraproject.org/fedora-toolbox on Fedora hosts, and the toolbox RPM. The OCI image is pulled by the RPM to set up a containerized interactive CLI environment.
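As a rough illustration of how the two parts fit together, the sketch below builds the default image reference and a matching toolbox create command line for a given Fedora release. The helper names are hypothetical, the command is only printed and never run, and it assumes that the installed toolbox supports the --distro and --release options.

```python
import shlex

REGISTRY = "registry.fedoraproject.org"
IMAGE_NAME = "fedora-toolbox"


def image_reference(release: int) -> str:
    """Return the default image reference for a Fedora release, e.g. 39."""
    return f"{REGISTRY}/{IMAGE_NAME}:{release}"


def create_command(release: int) -> list[str]:
    """Build (but do not run) a toolbox invocation for that release."""
    return ["toolbox", "create", "--distro", "fedora", "--release", str(release)]


if __name__ == "__main__":
    # The RPM side (the toolbox command) pulls the image side (the reference).
    print(image_reference(39))
    print(shlex.join(create_command(39)))
```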

Let’s look at each separately.

The image

First, we wanted to ensure that an up-to-date fedora-toolbox OCI image is published on registry.fedoraproject.org as a release-blocking deliverable at critical points in the development schedule, just like the installation ISOs for the Editions on download.fedoraproject.org. These points include the moment when an upcoming Fedora release is branched from Rawhide, and the Beta and Final releases.

One of the recurring complaints that we used to get was from users of Fedora Rawhide Toolbx containers, and it came up whenever Rawhide was branched in preparation for the Beta of the next Fedora release. At this point, the previous Rawhide version becomes the Branched version, and the current Rawhide version increases by one. If the fedora-toolbox images aren’t part of the mass branching performed by Fedora Release Engineering, then someone has to quickly step in after the branching has finished and refresh the images to ensure consistency. This sort of ad hoc manual co-ordination rarely works, and it left users in the lurch.
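The version shuffle at branching time can be modelled in a few lines. This is only a sketch of the bookkeeping described above, not anything that Fedora Release Engineering actually runs. The Versions type, both function names and the release numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Versions:
    rawhide: int
    branched: Optional[int] = None  # None while nothing is branched


def branch(before: Versions) -> Versions:
    """Old Rawhide becomes Branched, and Rawhide moves up by one."""
    return Versions(rawhide=before.rawhide + 1, branched=before.rawhide)


def tags_to_refresh(after: Versions) -> list[str]:
    """fedora-toolbox tags that need to be current right after branching."""
    releases = [r for r in (after.branched, after.rawhide) if r is not None]
    return [f"fedora-toolbox:{r}" for r in releases]


if __name__ == "__main__":
    after = branch(Versions(rawhide=40))
    print(after)
    print(tags_to_refresh(after))
```

If the images are left out of the branching, both tags are stale until someone refreshes them by hand, which is the gap that users used to fall into.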

With this change, the fedora-toolbox image is part of the nightly Fedora composes, and the branching is handled by Fedora Release Engineering just like any other release-blocking deliverable. This makes the image as readily available and updated as the fedora and fedora-minimal OCI images or any other deliverable, and we hope that it will improve the user experience for Rawhide Toolbx containers.

If someone installs the Fedora Beta or Final on their host, and creates a Toolbx container using the default image, then, barring exceptions, the host and the container now have the same RPM versions for all packages, just like Fedora Silverblue and Workstation are released with the same versions. This ensures greater consistency in terms of bug-fixes, features and pending updates.
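One way to spot-check this consistency is to compare package lists from the host and the container. The sketch below assumes that each list has one "name version-release" pair per line, for example as produced by an rpm -qa query with a suitable --queryformat. The function names and the sample data are invented for illustration, and packages installed in several versions at once (such as kernels) are not handled.

```python
def parse_packages(text: str) -> dict[str, str]:
    """Parse lines of the form 'name version-release' into a dict."""
    packages = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, version = line.split(maxsplit=1)
        packages[name] = version.strip()
    return packages


def version_mismatches(
    host: dict[str, str], container: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Packages present on both sides whose versions differ."""
    common = host.keys() & container.keys()
    return {
        name: (host[name], container[name])
        for name in sorted(common)
        if host[name] != container[name]
    }


if __name__ == "__main__":
    # Made-up sample data, not real Fedora package versions.
    host = parse_packages("bash 5.2-1\nrpm 4.19-1\n")
    container = parse_packages("bash 5.2-1\nrpm 4.18-1\n")
    print(version_mismatches(host, container))
```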

In the past, this wasn’t the case and it led to occasional surprises. For example, the change to make RPM use a Sequoia-based OpenPGP parser made it impossible to install third-party RPMs in the fedora-toolbox image, even long after the actual bug was fixed.

The RPM

Second, we wanted to have release-blocking test criteria to ensure that the toolbox RPM is usable at critical points in the development schedule. This is to ensure that changes in the Toolbx stack, and future changes in other parts of the operating system do not break Toolbx — at least not for the Beta and Final releases. It’s good to have the fedora-toolbox image be more readily available and updated, but it’s better if Toolbx works more reliably as a whole.

Examples of changes in the Toolbx stack causing breakage include FUSE preventing RPMs with file capabilities from being installed inside Toolbx containers, and Toolbx bind mounts preventing RPMs with %attr() from being installed or causing systemd-tmpfiles(8) to throw errors. Examples of changes in other parts of the OS include changes to Fedora’s Kerberos stack causing Kerberos to stop working inside Toolbx containers, changes to the sysctl(8) configuration breaking ping(8), and changes in Mutter breaking graphical applications.

The test criteria for the toolbox RPM also implicitly test the fedora-toolbox image, and co-ordinate several disparate groups of developers to ensure that the containerized interactive command line Toolbx environments on Fedora are just as reliable as those running directly on the host OS.

Tooling changes

This does come with a significant tooling change that isn’t obvious at first. The fedora-toolbox OCI image is no longer defined as a layered image through a Container/Dockerfile. Instead, it’s built as a base image through Kickstarts and Pungi, just like the fedora and fedora-minimal images.

This was necessary because the nightly Fedora composes work with Kickstarts and Pungi, not Container/Dockerfiles. Moreover, building Fedora OCI images from a Dockerfile with fedpkg container-build uses an ancient unmaintained version of OpenShift Build Service that requires equally unmaintained ancient versions of Fedora to run, and the fedora-toolbox image was the only thing using Container/Dockerfiles in Fedora.

We either had to update the Fedora infrastructure to use OpenShift Build Service 2.x; or use Kickstarts and Pungi, which uses Image Factory, to build the fedora-toolbox image. We chose the latter, because updating the infrastructure would be a significant effort, and by using Kickstarts and Pungi we get to stay close to the fedora and fedora-minimal images and simplify the infrastructure.

The Fedora Flatpaks were also being built using the same ancient and unmaintained version of OpenShift Build Service, and they too are in the process of being migrated. However, that’s outside the scope of this post.

One big benefit of fedora-toolbox not being a layered image based on top of the fedora image is that it removes the constant fight against the efforts to minimize the size of the latter. The fedora-toolbox image is designed for interactive command line use in long-lived containers, and not for deploying server-side applications and services in ephemeral ones. This means that dictionaries, documentation, locales, iconv converter modules, translations, etc. are more important than reducing the size of the images. Now that the image is built from scratch, it has full control over what goes into it.

Unfortunately, Image Factory is weakly maintained and setting it up on one’s local machine is a lot more complicated than using podman build. One can do scratch builds on the Fedora infrastructure with koji image-build --scratch, but only with explicitly granted permissions, and then one has to download the resulting tarball and use skopeo copy to place it in containers-storage so that Podman can see it. All of that is again more complicated than running podman build.
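To make the last step of that workflow concrete, here is a minimal sketch that prints the commands for importing a downloaded tarball into containers-storage. It assumes that the tarball is in a format that skopeo’s docker-archive transport can read. The tarball path and the local image name are hypothetical, and nothing is executed.

```python
import shlex


def import_commands(tarball: str, image: str) -> list[list[str]]:
    """Commands to copy a downloaded tarball into containers-storage."""
    return [
        # Copy from the tarball into the local storage that Podman uses.
        ["skopeo", "copy", f"docker-archive:{tarball}", f"containers-storage:{image}"],
        # Check that Podman can now see the image.
        ["podman", "images", image],
    ]


if __name__ == "__main__":
    # Both arguments are placeholders, not real build artifacts.
    commands = import_commands(
        "fedora-toolbox-scratch.tar.xz", "localhost/fedora-toolbox:scratch"
    )
    for command in commands:
        print(shlex.join(command))
```

Compare that with a single podman build in a directory containing a Container/Dockerfile, and the extra friction is easy to see.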

Due to this difficulty of untangling the image build from the Fedora infrastructure, we haven’t published the sources of the fedora-toolbox image for recent Fedora versions upstream. We do have a fedora-toolbox:39 image defined through a Container/Dockerfile, but that was done purely as a contingency during the Fedora 39 development cycle.

This does degrade the developer experience of working on the fedora-toolbox image, but, given all the other advantages, we think that it’s worth it.

As of this writing, there’s a Fedora 40 Change to switch to using KIWI to build the OCI images, including fedora-toolbox, instead of Image Factory. KIWI seems more strongly maintained and a lot easier to set up locally, which is fantastic. So, it should be all rainbows and unicorns, once we soldier through another port of the fedora-toolbox image to different tooling and a different source language.

Acknowledgements

Last but not least, getting all this done on time required a good deal of co-ordination and help from several different individuals. I must thank Sumantro for leading the effort; Kevin, Tomáš and Samyak for all the infrastructure and release engineering work; and Adam and Kamil for all the testing and validation.

  1. Toolbx now offers built-in support for Arch Linux and Ubuntu ↩︎

Written by Debarshi Ray

1 March, 2024 at 13:44

2 Responses


  1. Some day I should try Toolbx, it would be useful currently for me to test Fedora 40’s compiler on a Fedora 39 system (I get bug reports about new problems found by the latest version of GCC).

    Except that Toolbx is slightly more complicated than running Bash or Z shell directly on the host OS, and, therefore, requires a bit more diligence.

    Running a software is always easier than implementing it, I would say 😉

    swilmet

    2 March, 2024 at 18:38

    • Yes, that’s exactly one of the use cases for Toolbx! :)

      What I meant is: if you consider ‘toolbox enter’ as your shell, then that’s more complicated than running Bash or Z shell directly on the host, because it’s now a combination of the host OS, Bash or Z shell, and then the whole Toolbx stack. That last piece involves a lot of moving pieces which need to be thoroughly tested.

      But, yes, I agree that a CLI shell is not a trivial piece of software. It’s definitely a significant beast in and of itself.

      Debarshi Ray

      5 March, 2024 at 11:40

