Only Build Your Binaries Once

06 Jan 2023 - Giulio Vian - ~7 Minutes

One of the principles highlighted, many years ago, in Continuous Delivery is: Only Build Your Binaries Once (Chap.5 p.113).

Like all good principles, its formulation is simple but its ramifications are numerous. Today I’ll try to explain the consequences and clarify the reasons that justify the principle of generating the binaries only once.

Before going into it, we need to clarify the term “binaries”. In the context of the original authors, and in this article, “binaries” includes all kinds of artifacts produced by continuous integration (i.e. the build), and specifically the installation (or distribution) packages.

Violations

Let’s start the journey: what do the authors mean by “only once”? The easiest way to understand this is to consider the opposite scenario, where we build binaries more than once, each time for a specific use and destination.

A primary example of violating the principle, one I have observed myself, is in organizations that release versions built from different source branches. These groups manage sources using long-lived branches, where each branch matches a release environment (e.g. development, integration, acceptance, production). To release to an environment, the version on the matching branch is pulled, compiled and finally deployed to the target environment. Promoting a version is done by a merge operation, from a “lower” branch to the “higher” target.

Another example of a violation happens with some client-side JavaScript frameworks which embed environment-specific values in the deployment package. More precisely, it’s not the framework itself but the surrounding tools, in particular the tool that produces the package to be distributed. If you try to change these values directly in the package, you break the file’s hash check and the application becomes unusable. An additional problem with these JavaScript frameworks is selecting the framework build: for example, React has react.production.min.js and react.development.js, and HTML pages explicitly reference one or the other.
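One way to avoid baking such values into the package is to keep them in a separate file written at deploy time, which the application reads at startup. The following is a minimal sketch of that idea, not a feature of any specific framework; the file name config.json, the key apiBaseUrl and the helper names are assumptions made up for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_runtime_config(deploy_dir: Path, values: dict) -> Path:
    """Write environment-specific values NEXT TO the package, not inside it.

    The package produced by the build is never opened or modified, so its
    hash stays the same in every environment.
    """
    config_path = deploy_dir / "config.json"  # hypothetical file name
    config_path.write_text(json.dumps(values, indent=2, sort_keys=True))
    return config_path
```

Deploying to staging and then to production calls write_runtime_config with different values, while sha256_of on the bundle returns the same digest both times.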

Test invalidation

The approaches described clearly violate the principle of generating the package only once. The first and foremost problem they create is invalidating tests. If I test a version in one environment, say integration, but then deploy to the next environment, e.g. staging, a different version built from different sources, what guarantee do I have that the tests carried out in the integration environment are still valid and meaningful for staging?

Someone may object that not all changes invalidate the tests, but you have to be 100% certain about it, and better still if you can offer a mathematical proof. Someone else could argue that, by paying particular attention, it is possible to demonstrate that the change promoted via merge is the same at all promotion levels. To this objection I reply that, yes, the sources may be identical, but it is not possible to guarantee that the tools used to generate the binaries, and therefore the installation package, are identical and produce the same result.

I also observe that the question does not depend on the number of environments and promotion levels, but only on bringing into production binaries other than those tested. Do I need to remind you that only production ever counts? If I deploy to production something that wasn’t really tested, because it’s a different version, I’m taking a serious risk that something won’t work - it’s a leap in the dark.
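The rule "deploy only what was tested" can be enforced mechanically by recording the digest of the package at build time and checking it at every promotion. This is a minimal sketch under assumptions of my own: the Release record, the promote function and the environment names are hypothetical, and a real pipeline would fetch the package from an artifact store rather than receive it as bytes.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Release:
    version: str
    digest: str                                  # SHA-256 of the package, recorded once at build time
    passed: list = field(default_factory=list)   # environments where the tests passed


def digest_of(package: bytes) -> str:
    """Return the SHA-256 digest of the package bytes."""
    return hashlib.sha256(package).hexdigest()


def promote(release: Release, package: bytes, target: str, required: str) -> str:
    """Allow deployment to `target` only for the exact bytes tested in `required`."""
    if digest_of(package) != release.digest:
        raise ValueError("package differs from the one that was built and tested")
    if required not in release.passed:
        raise ValueError(f"{release.version} has not passed tests in {required}")
    return f"deploying {release.version} ({release.digest[:12]}) to {target}"
```

A package rebuilt from the same sources with a different toolchain would have a different digest, so promote rejects it even though the version label is unchanged.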

Merge? No thanks

Source-promotion strategies all too often result in huge merge sets: lots of changes and lots of conflicting files. Resolving all the conflicts in these situations takes an insane amount of energy, with a high probability of error.

Experience demonstrates that the smaller a change, the greater the chances of a successful merge, to the point that the version control tool completes it automatically. This is precisely the Lean principle of small batches, i.e. minimizing the amount of change contained in a release. Source changes sit at the beginning of the pipeline: if we are not able to restrain ourselves at this stage, it will be a lot more difficult at later stages.

Other reasons

As explained, invalidating tests is the main reason to avoid generating binaries that do not target production. There are a few additional reasons to stay away from environment-bound releases, and they are all different types of risk.

The first risk is the possibility of releasing the wrong version to the wrong environment: in the luckiest case the system won’t start, in the worst we may run into subtle but devastating bugs. Additional risks are found on the security and privacy front. Releasing the debug version into production gives attackers easier terrain, for example thanks to a simpler layout of call frames. As for privacy, in debug all memory is accessible without filters, and the risk of exfiltrating sensitive data (personally identifiable information, or PII) is greater.

Most platforms and languages offer two modes for compiling or translating, usually labelled debug and release. Code in debug mode is less efficient than code optimized for production. It is crucial to validate the optimized version, not the debug version, in pre-production: the behaviour changes in subtle and hardly predictable ways. For example, an operation may time out with a certain frequency in debug, while in release the frequency drops drastically.

Deploy

Up to this point we have focused on the biggest slice of the release package, namely the application (or component) binaries. The release and installation scripts are also part of the package or, equivalently, of the pipeline. They are subject to the same principle: use the same scripts in each and every environment, ensuring the consistency of the deployment process as much as possible.

Someone will argue that this is impossible: a test environment does not have the distributed high availability (HA) and disaster recovery (DR) infrastructure of a production environment, so, at a minimum, the scripts will have code sections conditioned on the type of environment. To them I reply that conditionals should be avoided as much as possible: the majority of IF statements can easily be replaced by looping over a list. In addition, at least one test environment (e.g. staging) must reflect the production HA/DR topology, usually scaled down to save money. Such an environment allows you to validate that the scripts work correctly without compromises, resulting in greater peace of mind for the production deployment.
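To illustrate replacing IF statements with a loop over a list, here is a minimal sketch in which each environment is described as data and one code path produces the deployment steps. The Environment fields, the region names and the step strings are all hypothetical; a real script would call the actual installation tooling instead of returning text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    name: str
    regions: tuple            # production simply lists more regions than test
    replicas_per_region: int


def deployment_plan(env: Environment, package: str) -> list:
    """Return the deployment steps; the same code path serves every environment."""
    steps = []
    for region in env.regions:  # no `if env.name == "production"` anywhere
        for index in range(env.replicas_per_region):
            steps.append(f"install {package} on {env.name}-{region}-{index}")
    # Replication links only exist between consecutive regions, so an
    # environment with a single region gets none, without a conditional.
    for primary, standby in zip(env.regions, env.regions[1:]):
        steps.append(f"replicate {primary} -> {standby}")
    return steps
```

A test environment with one region and one replica yields a single install step, while a production environment with two regions and two replicas yields four install steps plus one replication step, all from the same function.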

Collateral

I hope I was clear and exhaustive about the reasons supporting the principle of building distribution packages only once; now, let’s examine some effects of its application.

One benefit is saving the computational resources used by pipelines; it is typically offset by an increase in storage (disk) consumption. This saving is evident in the scenario of one source branch per environment. Someone may think that storing binaries is useless when adopting reproducible builds. With a reproducible build we are always sure of reproducing the exact same executables at any time. In theory. In practice, if we have to go back in time, we may find ourselves unable to use the same pipeline, with the same compilers and libraries used originally, unless we store every version of the toolchain and libraries, which is a much bigger task than saving the artifacts. Keeping the packages produced by the build is much easier and also satisfies possible audit and security demands.

At build time, we have a lot of (meta-)data about the distribution package and its contents. In particular, we can generate a Software Bill of Materials (SBOM), listing all the files to be distributed, their dependencies and, above all, the cryptographic hash of each file. When saved in a safe place, the SBOM allows you to check whether the files in production have been altered by an attack.
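The hash-checking part can be sketched in a few lines. Note that this is a simplified hash manifest and not a real SBOM format: it records only file paths and SHA-256 digests, with no dependency information, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(package_dir: Path) -> dict:
    """Map each file's relative path to its SHA-256 digest (run at build time)."""
    return {
        path.relative_to(package_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(package_dir.rglob("*"))
        if path.is_file()
    }


def find_alterations(manifest: dict, deployed_dir: Path) -> dict:
    """Compare the deployed files with the manifest recorded at build time."""
    current = build_manifest(deployed_dir)
    return {
        "modified": sorted(p for p in manifest if p in current and current[p] != manifest[p]),
        "missing": sorted(p for p in manifest if p not in current),
        "unexpected": sorted(p for p in current if p not in manifest),
    }
```

The manifest must be stored somewhere the production hosts cannot write to; otherwise whoever alters the files can alter the manifest as well.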

Exceptions

The exception, they say, proves the rule, and the principle of building binaries once and for all has one exception: signing the binaries or the package. For example, mobile applications, like Android APK or Apple IPA, must be signed to be published on a store.

Usually the cryptographic keys for the official signature are kept in a safe place, hopefully protected by a hardware security module (HSM), and made available to a pipeline with privileges, subject to a specific authorization and often a manual approval. A normal build cannot access these keys.

The approach I recommend is to use a single script for all builds: the initial stage produces unsigned binaries, and it is followed by two more stages, a non-privileged stage which signs using a key internal to the dev/test group, and a last stage that uses the official signing key on the neutral binaries. Keep strict control over the circulation of internally signed binaries: the development signature should never appear in public, e.g. on official stores. If the internal key is stolen, users might be tricked into using fake versions of the software signed with a valid but unofficial key.
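The shape of such a three-stage script can be sketched as follows. This is only an illustration of the flow under strong assumptions: real store signing uses asymmetric keys and the platform's own tooling, whereas here an HMAC from the Python standard library stands in for the signature, and the stage names and keys are hypothetical.

```python
import hashlib
import hmac


def build_stage() -> bytes:
    """Stage 1: produce the unsigned ("neutral") package exactly once."""
    return b"application package bytes"  # placeholder for the real build output


def signing_stage(package: bytes, key: bytes, label: str) -> dict:
    """Stages 2 and 3: sign the SAME bytes, with the internal or the official key.

    HMAC-SHA256 is a stand-in for a real code signature (assumption).
    """
    return {
        "label": label,
        "package_sha256": hashlib.sha256(package).hexdigest(),
        "signature": hmac.new(key, package, hashlib.sha256).hexdigest(),
    }


def verify(package: bytes, signed: dict, key: bytes) -> bool:
    """Check that a signature matches the package for the given key."""
    expected = hmac.new(key, package, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])
```

Both signing stages receive the output of the single build_stage run, so the internally signed and officially signed results share the same package_sha256 and differ only in the signature.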

What do you think? Am I wrong? Let me know in the comments.
