Date: 2021-09-20
This post is a followup to the previous "Build Steps" entry. Here I want to list out all of the pieces used to build the distro, say a few words about them, and provide links to see them in action.
Very briefly, here is what we'll cover. This list is roughly in order from "small" tools to "larger" ones. Larger tools are abstractions on top of the smaller tools, and call them. For example, Mock calls rpmbuild, so Mock is a "larger" tool. Mock is in turn called by Koji, so Koji is bigger than Mock. I hope that makes sense; it's the best sorting system I could think of!
Website : https://rpm.org
Source code : https://github.com/rpm-software-management/rpm
Written in : C
Usually called by : Mock
rpmbuild is a classic tool that's still in use today. Its job is simple: take a source RPM containing the "recipe" for building it, and compile it into a binary RPM. Source RPMs generally contain code, patches, and a .spec file describing version/dependencies/etc.
One of the weaknesses of traditional rpmbuild is the idea of "contaminated dependencies". Basically, the machine running rpmbuild is expected to have all the required RPMs already installed for the build. For example, if a source RPM requires automake and g++ to build, then those packages better be installed! But a running server or desktop could have all kinds of extra packages installed that the code needs to compile, but that are never listed in the RPM's .spec file.
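To make the contamination problem concrete, here is a minimal Python sketch. It is not part of rpmbuild; the function name and the package lists are hypothetical, made up purely for illustration. It compares what a .spec file declares against what a build actually used on a "dirty" machine.

```python
def undeclared_build_deps(declared, used_during_build):
    """Return packages the build relied on that the .spec never declared."""
    return sorted(set(used_during_build) - set(declared))


# Hypothetical data: the .spec lists two build requirements, but the build
# quietly picked up a third package that happened to be installed already.
declared = ["automake", "g++"]
used = ["automake", "g++", "libxml2-devel"]

missing = undeclared_build_deps(declared, used)
print(missing)
```

On a long-lived build machine this gap goes unnoticed, because the extra package is simply there. In a clean environment the same build would fail, which is exactly the signal we want.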
Fortunately, we have a workaround: a way to create a temporary container, with only the bare minimum of RPM dependencies installed. We run rpmbuild inside the container, and can guarantee that we only have our dependent packages installed, and nothing else. This software is called Mock.
Website + Source Code : https://github.com/rpm-software-management/mock
Written in : Python
Usually called by : Koji (but can be run manually for local/test builds)
Mock is a tool for building RPM packages. Essentially, it creates a chroot or systemd-nspawn container, and installs a bare-minimum bootstrap system inside it. It then reads a source RPM's build-time requirements, and performs a dnf install of those requirements inside the container.
Now that the container has the smallest possible set of requirements installed, Mock performs an rpmbuild against the source RPM inside the container, just as if it were being done on a regular system. Doing it this way we guarantee that only the necessary packages are installed, and we don't have to constantly install/uninstall dependencies on a build server to satisfy rpmbuild. We can do it all inside the minimal container, which then gets thrown away after we're done!
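For a local test build, the whole sequence above is kicked off by a single Mock invocation. The sketch below only assembles the command line in Python. The helper name, the source RPM filename, and the rocky-8-x86_64 config name are assumptions for illustration; check which configs your Mock installation actually ships.

```python
import shlex


def mock_rebuild_command(srpm_path, config="rocky-8-x86_64", resultdir=None):
    """Build the argument list for rebuilding one source RPM with Mock."""
    cmd = ["mock", "-r", config]            # -r selects the chroot config
    if resultdir is not None:
        cmd += ["--resultdir", resultdir]   # where finished RPMs and logs land
    cmd += ["--rebuild", srpm_path]         # rebuild the given source RPM
    return cmd


# Hypothetical source RPM name, used only for the example.
cmd = mock_rebuild_command("hello-1.0-1.el8.src.rpm", resultdir="./results")
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would run the build, assuming Mock is installed and your user is allowed to use it.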
Mock is incredibly flexible via its config file, allowing the minimal bootstrap environment to be customized, pre- and post- actions to be run inside the container during build, and all kinds of things RPM-builders would be interested in. Check out its documentation for further info.
Source code : https://github.com/rocky-linux/srpmproc
Written in : Go
Usually called by : Distrobuild
Srpmproc is a tool developed by the Rocky Linux project, for our purposes. It has many options, but its core purpose is simple: Take source code for building RPMs from one repository, and put it in another. If the code/RPM spec from the original repository requires any patching or transformation, it does that as well, according to a standardized language. For the Rocky Linux project, srpmproc pulls sources from git.centos.org and imports them into git.rockylinux.org.
This is a little more complicated than it sounds, as the program has to contend with things like branch name translation, non-trivial patch operations during import, and the import of source tarballs not stored with the main git repository. Srpmproc will feature more prominently in my next article, which covers source code import, patching, and debranding.
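To give a feel for the branch name translation piece, here is a tiny Python sketch. This is not srpmproc's code (srpmproc is written in Go). The function is hypothetical, and the "c" to "r" prefix mapping is an assumption about how upstream and Rocky branch names relate.

```python
def translate_branch(upstream_branch, upstream_prefix="c", target_prefix="r"):
    """Map an upstream branch name onto the matching import branch name."""
    if not upstream_branch.startswith(upstream_prefix):
        raise ValueError(f"unexpected branch name: {upstream_branch!r}")
    # Swap only the leading prefix; everything after it is kept verbatim.
    return target_prefix + upstream_branch[len(upstream_prefix):]


print(translate_branch("c8"))
print(translate_branch("c8-stream-1.4"))
```

The real tool has to handle many more cases than a prefix swap, which is part of why the import step is more involved than it first appears.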
Source code : https://pagure.io/koji/
Home Page : https://fedoraproject.org/wiki/Koji
Written in : Python
Usually called by : Distrobuild, Human user in web browser or Koji API
Rocky Live Instance : https://koji.rockylinux.org/koji/
Koji is a centralized build system for RPM packages. It integrates several components, including a web interface, API, internal yum repositories, and distributed builder hosts to do the actual package building. Other programs (like Distrobuild) generally query or control Koji via its robust API and Kerberos authentication system.
Koji "Hosts" are where the actual compilation is done. They simply check for instructions from the Kojihub instance, and execute Mock/rpmbuild with the appropriate options enabled to perform the package build. We have build hosts for each processor architecture that Rocky supports. Currently that means we have x86_64 and aarch64 (ARM 64-bit), and PPC64 (PowerPC) is likely on the way soon. (fingers crossed!)
Builds in Koji are organized according to targets and tags. A package build is associated with a target, while the resulting package is associated with a tag. This allows us to organize packages according to release, generally by minor version of the distro. For example, for Rocky 8.4 updates (the current minor release), builds are done against the dist-rocky8_4-updates target, and package artifacts are tagged with the dist-rocky8_4-updates-build tag when builds complete. There may be a future article where we dive into Koji targets, tags, repositories, and how they work together.
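Here is a hedged sketch of how a client might look up a target and the tags behind it. It uses two hub calls, getBuildTarget and listTagged, but runs them against a small fake session so the example works without a Koji hub. The FakeSession class and all of its data are hypothetical.

```python
def describe_target(session, target_name):
    """Summarize a build target: its tags and the latest tagged builds."""
    target = session.getBuildTarget(target_name)
    if target is None:
        raise LookupError(f"no such build target: {target_name}")
    builds = session.listTagged(target["dest_tag_name"], latest=True)
    return {
        "build_tag": target["build_tag_name"],
        "dest_tag": target["dest_tag_name"],
        "latest_nvrs": sorted(build["nvr"] for build in builds),
    }


class FakeSession:
    """Stand-in for a Koji client session, holding made-up example data."""

    def getBuildTarget(self, name):
        if name != "example-target":
            return None
        return {"name": name,
                "build_tag_name": "example-build",
                "dest_tag_name": "example"}

    def listTagged(self, tag, latest=False):
        return [{"nvr": "zlib-1.2.11-17.el8"}, {"nvr": "bash-4.4.19-14.el8"}]


info = describe_target(FakeSession(), "example-target")
print(info)
```

With the koji Python library installed, the fake could be swapped for koji.ClientSession(hub_url), where hub_url points at the hub's XML-RPC endpoint.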
Source code / home page : https://pagure.io/fm-orchestrator
Written in : Python
Usually called by : Distrobuild
Intro to RPM Package Modules : https://docs.fedoraproject.org/en-US/modularity/
MBS is the Module Build Service, a way to build modular streams in Fedora and other RPM-based distros, including RHEL (and therefore Rocky!).
Quick crash course: Modularity is a way to get multiple versions of the same software available for install in an RPM-based distro. For example, in RHEL+Rocky, the PHP language has 3 different versions available for installation: 7.2, 7.3, and 7.4, with the default being 7.2. You can flip between versions by enabling the proper version using the dnf module subcommand. Then, when you dnf install php, DNF will "see" only the enabled version and its related packages. You can get a quick summary of all the module versions available to you with: dnf module list. Try it and see what's there!
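As a rough sketch, switching streams boils down to three DNF invocations. The Python below only builds the command lists and does not run them. The helper name is hypothetical, and the reset step is my assumption about what is needed when a different stream is already enabled.

```python
def switch_module_stream(module, stream, package=None):
    """Return the dnf commands to switch a module to another stream."""
    package = package or module  # e.g. the php module provides a php package
    return [
        ["dnf", "module", "reset", "-y", module],                # clear old stream
        ["dnf", "module", "enable", "-y", f"{module}:{stream}"],  # pick new stream
        ["dnf", "install", "-y", package],                       # install from it
    ]


commands = switch_module_stream("php", "7.4")
for command in commands:
    print(" ".join(command))
```

Each list could be handed to subprocess.run as root to actually perform the switch.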
Modules are a really cool feature, but they do place an extra complexity on development. Collections of module packages must be compiled extremely carefully, because they will need different chains of dependencies based on which version is being built. PHP 7.2 cannot have the same dependency chain as PHP 7.4! They are simply too different. We have to carefully control this at build-time, which is where the Module Build Service comes in.
MBS is called via its own API. Its primary purpose is to talk to Koji, and submit module builds in groups with special settings according to that module's needs. For example, certain modules might need special RPM build macros assigned, or depend on specific versions of other modules. And all of that must be controlled to ensure the packages within the module are built correctly and have a dependency chain that will work properly for end users.
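To sketch what calling the MBS API could look like, the Python below prepares (but does not send) a build submission request. The endpoint path and the scmurl/branch fields reflect my understanding of the MBS REST API, so treat them as assumptions. The hostnames, commit placeholder, and branch are hypothetical, and authentication is left out entirely.

```python
import json
import urllib.request


def build_module_request(mbs_url, scmurl, branch):
    """Prepare a POST request asking MBS to build a module from Git."""
    payload = json.dumps({"scmurl": scmurl, "branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url=mbs_url.rstrip("/") + "/module-build-service/1/module-builds/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server and repository, for illustration only.
request = build_module_request(
    "https://mbs.example.org",
    "https://git.example.org/modules/php.git?#0123abc",
    "7.4",
)
print(request.get_method(), request.full_url)
```

A real submission would also need credentials that the MBS instance accepts before the request could be sent with urllib.request.urlopen.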
Modularity in general is a big topic, and will get its own article in this series. It's important to know that MBS has its own API, and acts as a special client to Koji. It is concerned with organizing and submitting modular package builds.
Source code / home page : https://pagure.io/sigul
Written in : Python
Usually called by : Distrobuild
Sigul is the Fedora project's signing system for packages, and Rocky Linux uses it as well. It has an API "bridge" that lives on a listener server, which accepts logins and signing requests. The bridge server then talks on a private network to a private signing/master server, which has the private release keys that actually sign the packages. As mentioned before, we sign packages with our private key so end-users can be mathematically certain that the packages they download actually came from the Rocky Linux project.
Sigul can be called manually, but generally it is part of the automatic package build process orchestrated by Distrobuild. First, Koji (and sometimes MBS) is called to execute the actual source checkout, compilation, etc. Then Distrobuild grabs the freshly created packages from Koji and automatically submits them to Sigul for signing, and Sigul produces a cryptographic signature for each package. That signature is then imported into Koji, and Koji will combine the signature and package file to produce a signed package.
Obviously Sigul is quite powerful, but our use case is fairly simple. Submit unsigned RPM as input, receive a signature back as output. Then instruct Koji to combine signature and RPM, and put the resulting signed file into the appropriate place where the release process can continue. High availability and capacity are important for our Sigul installation - we sign a LOT of RPMs!
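Done by hand, the sign-then-combine flow might look something like the sketch below. It only assembles command lists. The subcommands (sigul sign-rpm, koji import-sig, koji write-signed-rpm) are ones I believe the respective clients provide, but exact flags can differ between versions, and the helper name, key name, key ID, and file names are all hypothetical. Note that in this manual variant Sigul hands back a signed copy of the RPM, and Koji extracts the signature from it on import.

```python
def signing_steps(rpm_path, signed_path, key_name, key_id, nvr):
    """Return the commands for signing one RPM and publishing it in Koji."""
    return [
        # 1. Ask Sigul to sign the RPM with the named key.
        ["sigul", "sign-rpm", "-o", signed_path, key_name, rpm_path],
        # 2. Import the signature from the signed copy into Koji.
        ["koji", "import-sig", signed_path],
        # 3. Have Koji write out the signed RPM for that build.
        ["koji", "write-signed-rpm", key_id, nvr],
    ]


steps = signing_steps(
    rpm_path="hello-1.0-1.el8.x86_64.rpm",
    signed_path="hello-1.0-1.el8.x86_64.signed.rpm",
    key_name="example-key",
    key_id="0123abcd",
    nvr="hello-1.0-1.el8",
)
for step in steps:
    print(" ".join(step))
```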
Source code / home page : https://pagure.io/pungi
Written in : Python
Rocky 8 Pungi Config : https://git.rockylinux.org/rocky/pungi-rocky/-/tree/r8
Usually called by : Rocky Linux Release Engineer
Pungi is a distribution "compose" tool. It creates (composes) both our official repositories (BaseOS, AppStream, etc.), as well as the bootable Rocky Linux disk images (ISOs). It is a non-trivial tool, and has several sub-modules for doing all of these different tasks.
Its primary purpose is to read a manifest of packages that belong in the official repositories. It connects to Koji, plucks those select packages out of the build system, and puts them in a staging area where repositories are built. Pungi also creates the Rocky ISO installation media according to a list of packages as well. It specifies what gets included in each ISO image (Everything, Minimal, Boot).
When packages are produced, not all of them make it into the final repositories. Debug packages, many -devel packages, and others are not included, and Pungi helps us filter what stays and what goes. Ultimately, Rocky Linux strives to match the RHEL composes package-for-package, version-for-version. This means that a lot of packages which we'd like to include get built, but are ultimately left out of the official distribution. We try to make these packages available through special -devel repositories. As a last resort, they are always downloadable from Koji (or a Koji mirror). We want to make as much software available as possible!
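The filtering idea can be sketched in a few lines of Python. This is not how Pungi is implemented or configured. The function, the suffix rules, and the package names are hypothetical, chosen only to show the kind of decision being made.

```python
def split_for_compose(package_names, shipped_devel=()):
    """Split built packages into those shipped and those held back."""
    kept, held_back = [], []
    for name in package_names:
        is_debug = name.endswith(("-debuginfo", "-debugsource"))
        is_hidden_devel = name.endswith("-devel") and name not in shipped_devel
        if is_debug or is_hidden_devel:
            held_back.append(name)
        else:
            kept.append(name)
    return kept, held_back


# Hypothetical build output; one -devel package is explicitly allowed through.
built = ["zlib", "zlib-devel", "zlib-debuginfo", "foo", "foo-devel"]
kept, held_back = split_for_compose(built, shipped_devel={"zlib-devel"})
print(kept)
print(held_back)
```

The held-back list is roughly what ends up in the extra -devel repositories or stays available only from Koji.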
Rocky makes our Pungi configs available in Git, via the link above.
Source code : https://github.com/rocky-linux/distrobuild
Written in : Python
Usually called by : Rocky Linux Release Engineer
Rocky Live Instance : https://distrobuildstg.rockylinux.org/
Distrobuild is a Rocky Linux-specific tool, used primarily as a single web UI that makes interacting with several of these disparate tools much easier. From Distrobuild, Rocky devs can import (and debrand) source code, launch package builds ("normal" packages or modules!), sign packages, and view history of a package's builds/imports.
It serves web content in its own right, but also acts as a client to several of these components: Koji (builds), MBS (module builds), Sigul (package signing). It also calls Srpmproc to import sources from RHEL and run automated debranding/patches en masse.
The idea was to have a single point which could automate much of the nuts-and-bolts of the build process, and look half-way decent while doing it. Koji's web interface is a logical place to do this, but it's a bit clunky and just doesn't integrate well with other pieces of build infrastructure, particularly module builds.
In the spirit of openness and full disclosure, Rocky's Distrobuild instance (perhaps the only one in the world?) is available publicly for browsing. You can see logs of all source imports, status of all package builds, and essentially view a history of every build the Rocky team has ever done up to this point. Pretty neat!
RabbitMQ : https://www.rabbitmq.com/
Celery : https://docs.celeryproject.org/en/stable/
Fedmsg : https://github.com/fedora-infra/fedmsg
These 3 are all queue/task systems used by other pieces of the build system. Koji prefers the (dated) fedmsg (Fedora messaging system), while other pieces use a combination of the Celery/RabbitMQ systems.
The details of these are important, but can't be covered in-depth here. The important thing is that different pieces of build infrastructure can reliably talk to each other via these queue systems. Messages can be added to a queue to assign tasks, and consumed from the queue to execute those tasks.
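The add-to-queue, consume-from-queue pattern can be shown with nothing but Python's standard library. To be clear, this swaps in an in-process queue.Queue and a thread as stand-ins; it does not use RabbitMQ, Celery, or fedmsg, and the function name and task strings are hypothetical.

```python
import queue
import threading


def run_builds(package_names):
    """Push build tasks onto a queue and let one worker consume them."""
    tasks = queue.Queue()
    results = []

    def worker():
        while True:
            name = tasks.get()
            if name is None:            # sentinel: no more work
                tasks.task_done()
                break
            results.append(f"built {name}")  # pretend to execute the task
            tasks.task_done()

    consumer = threading.Thread(target=worker)
    consumer.start()
    for name in package_names:          # producer side: assign tasks
        tasks.put(name)
    tasks.put(None)
    tasks.join()                        # wait until every task is handled
    consumer.join()
    return results


print(run_builds(["bash", "zlib"]))
```

A real message broker adds the parts that matter in production: the producer and consumer live on different machines, and messages survive if one side goes down.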
The next topic I'll cover in this journey is source code management. We will see in-depth how (and where!) the sources are stored, how things are organized, and discover why exactly it's done this way. There will be lots of Git. Should be fun!
Thanks for reading.
-Skip