Shashank

Why use the Bazel build system?

Google's internal build tool, Blaze, was open-sourced as Bazel in 2015. Their website does a pretty good job of explaining why you should use Bazel for your next project. But that doesn't have to stop me from telling you why you should choose Bazel.

Now, Bazel is built on top of the following key principles:

  1. Cryptographic hashing of all inputs - files, toolchain and environment
  2. Running the actions without side-effects
  3. Reusing outputs as long as inputs have not changed. This has its roots in functional programming. (Or so I'm told)

There are 2 important properties that make caching accurate:

side-effects: The key point above is (2): running the actions without side-effects. This is where Bazel differs from other build systems like SCons/Gradle/Make etc. Bazel rules are written in a language called Starlark, which makes it easy to keep track of the side-effects of actions. Additional plugins/rules created for Bazel generally do not have side-effects either.
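
As a rough sketch of what that looks like (the rule and file names here are made up), a custom Starlark rule has to declare every input and output explicitly, and the only way for it to do real work is to register an action over those declared files:

    # concat.bzl -- a minimal custom rule (hypothetical example)
    def _concat_impl(ctx):
        out = ctx.actions.declare_file(ctx.label.name + ".txt")
        ctx.actions.run_shell(
            inputs = ctx.files.srcs,   # declared inputs, hashed by Bazel
            outputs = [out],           # declared outputs, tracked by Bazel
            command = "cat {} > {}".format(
                " ".join([f.path for f in ctx.files.srcs]),
                out.path,
            ),
        )
        return [DefaultInfo(files = depset([out]))]

    concat = rule(
        implementation = _concat_impl,
        attrs = {"srcs": attr.label_list(allow_files = True)},
    )

The rule can only describe its work in terms of declared inputs and outputs, which is what makes side-effects easy to track across the whole action graph.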

sandboxed-execution: Bazel also runs every action in a "sandbox". You can think of a sandbox as akin to a lightweight container. All the declared inputs are mapped into the sandbox, and the generated outputs are closely monitored. This is done for every action. If a dependency is missing from the input specification - that's a build failure.
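
For example (the file names are made up), a genrule that reads a file it never declared will fail under sandboxed execution, because that file simply isn't mapped into the sandbox:

    # BUILD (hypothetical example)
    genrule(
        name = "combine",
        srcs = ["a.txt"],        # only a.txt is mapped into the sandbox
        outs = ["combined.txt"],
        # b.txt exists in the workspace but was not declared in srcs, so the
        # command fails inside the sandbox instead of silently depending on it.
        cmd = "cat $(location a.txt) b.txt > $@",
    )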

Because of the lack of side-effects and the sandboxed execution, each action can basically be treated as a mathematical function: the same inputs will always produce the same output. If the inputs haven't changed - you can reuse the cached output. Bazel also comes with a remote cache. Users generally configure CI to produce the cached objects on CI runs, and those objects are then reused by future CI builds and by local developer builds too. Everyone gets incremental, correct builds.

Below are the various features of Bazel, compared to cmake/make.

Remote caching and incremental builds

We talked about this a little bit earlier. While cmake/make/ninja support incremental builds based on on-disk caching, Bazel supports first-class remote caching, where (typically) the CI generates the cacheable objects. The cache is reused by all new/clean builds - be it a local developer build or another CI run. This is a powerful technique that avoids a large swath of unnecessary build actions. Coupled with the low-level action definition, it means every individual action can be cached. In Bazel terminology, an action is the low-level execution of a process that takes in a specified set of inputs to produce a specified output.
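
A minimal sketch of how this is usually wired up (the endpoint is a placeholder): point every build at a shared cache via .bazelrc, and let only CI upload to it:

    # .bazelrc (illustrative endpoint)
    build --remote_cache=grpcs://cache.example.com
    # Developers only read from the cache...
    build --remote_upload_local_results=false
    # ...while CI runs (invoked with --config=ci) also write to it.
    build:ci --remote_upload_local_results=true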

Note: ccache does support HTTP servers, but they are treated as second-class citizens. Furthermore - without the "correctness" guarantees that come with strict sandboxing, relying on CI-generated cache objects all the time becomes problematic. It leads to frequent nuking of the cache.

Remote execution on a wide farm

Bazel enables a local build to use a really wide (think ~500 CPUs) cluster to run build and test actions. This massive parallelism enables fast full builds when necessary. Now, remote (or distributed) build execution is not really a new concept - distcc has been very popular in this area. But distcc is limited to compile actions. Bazel can remotely execute any action - including compiling, linking, unit tests and scans. Furthermore, the remote build server management overhead is significantly reduced with Bazel's native support for toolchains and platforms.
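
Pointing a build at such a farm is, again, mostly a .bazelrc concern (the endpoint and instance name below are placeholders):

    # .bazelrc (illustrative endpoint)
    build:remote --remote_executor=grpcs://rbe.example.com
    build:remote --remote_instance_name=main
    # Fan out far beyond the local core count.
    build:remote --jobs=500

A developer opts in with bazel test --config=remote //... and gets the farm's parallelism from their laptop.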

See also: recc

Cloud Native technologies

Building on the point above about remote build server management: due to the nature of the remote APIs defined by Bazel (and other related technologies like Pants, Please and recc), there is a clean delineation of duties between clients and servers. This allows the servers to be bare-bones, since they typically do not have to carry any toolchain dependencies (the entire toolchain is expected to be uploaded into the CAS by the client). This allows an operator to scale servers up/down very easily - for example using Kubernetes constructs. Furthermore, Bazel supports building remotely and locally inside a docker container. This allows an application team to "freeze" parts (or all) of its toolchain and provide a consistent build/test experience locally and in CI.
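
One way this shows up in practice is pinning the remote execution environment to a container image directly in the platform definition (the image name below is a placeholder), so every remote action runs inside the same frozen toolchain image:

    # BUILD (illustrative)
    platform(
        name = "remote_linux_x86",
        constraint_values = [
            "@platforms//os:linux",
            "@platforms//cpu:x86_64",
        ],
        exec_properties = {
            # Remote workers run every action for this platform inside this image.
            "container-image": "docker://gcr.io/my-project/builder:frozen",
        },
    )

The platform is then handed to Bazel with flags like --extra_execution_platforms, and the servers themselves stay generic.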

Multi-platform and Cross-compilation support

Bazel supports generating binaries for multiple platforms and architectures. A developer can maintain all the different configuration options for different platforms in a single rule definition. For example, check out this cc_binary rule for tensorflow with different linkopts for Mac, Windows and Linux. This allows a single definition to be used across multiple platforms. Also, since Bazel allows fully specifying toolchains, you can configure it to use a cross-compiler (like Linaro or MinGW) instead of the standard host compiler to cross-compile. See an example of bazel cross-compiling.
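
A stripped-down version of that pattern looks roughly like this (the flag values are illustrative):

    # BUILD (illustrative)
    cc_binary(
        name = "app",
        srcs = ["main.cc"],
        linkopts = select({
            "@platforms//os:windows": ["-DEFAULTLIB:ws2_32.lib"],
            "@platforms//os:macos": ["-framework CoreFoundation"],
            "//conditions:default": ["-lpthread", "-ldl"],
        }),
    )

Cross-building then comes down to passing --platforms for a platform that has a registered cross toolchain.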

Coverage support

Bazel has first-class support for collecting coverage from unit tests in Java, C++ and a few other popular languages. Coverage runs can also be executed remotely, with the results cached, just like any other action.
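
Usage is a single command; for example, to collect and merge coverage across a repo (the exact flags can vary by language and Bazel version):

    # Run tests with instrumentation and merge the results into one lcov report.
    bazel coverage --combined_report=lcov //...
    # The merged report typically lands under bazel-out/_coverage/ and can be fed to genhtml.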

License support

Bazel supports downloading dependencies and ensuring that the license of each package used matches defined firm-wide policies. These become low-level rules that enforce which licenses are allowed for a given project type. They are especially useful for giving the user early feedback about which packages violate firm policies. See the rules_license library for more information.
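
A sketch of what that looks like with rules_license (the license-kind label and wiring below are illustrative - check the library's docs for the current API): each third-party package declares its license, and every target in the package is associated with it by default:

    # third_party/foo/BUILD (illustrative)
    load("@rules_license//rules:license.bzl", "license")

    package(default_applicable_licenses = [":license"])

    license(
        name = "license",
        license_kinds = ["@rules_license//licenses/spdx:Apache-2.0"],
        license_text = "LICENSE",
    )

Policy checks can then walk the build graph and flag any target whose applicable licenses fall outside the allowed set for that project type.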

SBOM generation

Bazel requires a complete specification of the toolchain and all of the project's dependencies to run any action. This is especially true for remote builds, since actions cannot rely on any tools being present locally on the remote server. This increases the startup time for a new project, but it also provides a full and correct software bill of materials. Bazel has support for enforcing allow-lists and deny-lists directly in the developer workflow via rules_oss_audit.