Rock your sbt build time: Understand how sbt works

Triplequote’s mission is to make Scala code compile fast. Thanks to Hydra, the world’s only multicore Scala compiler, we delivered on this promise. Speedups are enticing and range from 2x up to 7x (see our case studies or the series of articles about compiling Scala on different Amazon EC2 instances). This is great, but one nuance we realized over the years is that when developers are frustrated about slow compilation, they sometimes mean more than just compilation.

How can that be? The reason is that we never compile our code by invoking the Scala compiler directly. There is usually some other tool between us and the Scala compiler, and therefore compilation time is very often inflated by the specific tool at hand. In the Scala ecosystem, the tool we use to compile and build our projects is generally sbt (70%+ according to both the IntelliJ Scala 2019 survey and our State of Scala Compilation survey). Hence, understanding which inefficiencies are inherent to sbt and which depend on our setup is key to minimizing build times.

In this series of three articles about sbt we are going to cover the following aspects:

  • In this first instalment we lay the foundation to understand how sbt works. We cover what a setting is, and we build an intuition of why it can take a long time for sbt to load a build in memory (on large projects loading the build can easily take more than 30 seconds!). On the way, we learn about a few utilities that are readily available at our fingertips and that can brilliantly help us debug problems (say goodbye to println statements!).

  • In the second instalment we compare build loading times on the CI versus local environment, with the goal of finally understanding why they are different. Armed with this knowledge, we can optimize our CI build times and obtain massive speedups!

  • In the third and last instalment we cover what happens after the build is loaded in memory, and we execute a task such as compile. You’ll learn how to profile sbt and understand what other tasks are triggered in reaction to compile in your build (the emphasis on your is key, as every build is different!).

Let’s get started by exploring the building blocks of an sbt build.

What’s in an sbt build?

In its very essence, an sbt build is just a collection of settings (in this article we use “settings” to refer to both sbt’s Task and Setting types). Settings are usually contributed by plugins, declared in either the global or the project plugins folder. A third possibility is that a build defines new settings, but this is somewhat less common in practice.

So, what’s in a setting?

Settings 101

The job of a setting is simply to generate a value. To generate a value a setting may depend on other settings. Therefore, to evaluate a setting its dependencies need to be evaluated first.
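To make this concrete, here is a minimal build.sbt sketch of a setting that depends on two others. The greeting key and the project name are hypothetical and exist only for illustration; name and version are standard sbt keys.

```scala
// build.sbt (sketch): `greeting` is a hypothetical key made up for this example.
lazy val greeting = settingKey[String]("A message derived from other settings")

lazy val root = (project in file("."))
  .settings(
    name := "scalaworld", // assumed project name, for illustration only
    // Reading name.value and version.value makes `greeting` depend on both keys,
    // so sbt has to evaluate them before it can compute `greeting`.
    greeting := s"${name.value} ${version.value}"
  )
```

Calling .value on another key is what creates the dependency: sbt records it when the build is loaded and evaluates the dependencies first.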

But how can we discover the dependencies of a setting? And how can we quickly pinpoint where in the code a specific setting is defined?

Turns out it’s very simple to answer both of these questions!

Dependencies of a setting

The dependencies of a setting can be easily found by leveraging the built-in inspect tree command. Next is an example showing how to find the dependencies of the sourceDirectory setting.

sbt:root> inspect tree sourceDirectory
[info] sourceDirectory = src
[info]   +-baseDirectory =
[info]     +-thisProject = Project(id root, base: /projects/scalaworld, …)

Pretty neat!

Where is a setting defined?

Quite conveniently, the very same inspect command can also help us find out where a setting is defined in the code. Doing so can be extremely useful when debugging a build. Let’s see where the sourceDirectory setting is defined.

sbt:root> inspect sourceDirectory
...
[info] Defined at:
[info]     (sbt.Defaults.paths) Defaults.scala:328
...

We now know sourceDirectory is defined in the sbt/Defaults.scala source file. And it’s in good company, as most sbt settings are defined in the very same source.

By all means, the inspect command should be part of our sbt toolbox!

How many settings in a build?

Now that we know how to explore settings, we might be wondering how many settings are defined in a typical build. After all, we have the impression we use sbt just to compile, test or run the application, so they shouldn’t be too many, should they?

As a little experiment, I checked out a few open source projects that I believe are good representatives of a typical commercial project build, to see how many settings are defined in their build. The first two are built with the Play Framework, while the third and fourth are well-known open-source projects. Here is the result:

project@git-hash            settings
guardian/frontend@2ba8094   23793
ornicar/lila@0647f7f        29712
akka/akka@bc4c6ba           32047
circe/circe@8eb2cd5         37007

These projects define between roughly 24k and 37k settings each, about 30k on average! The immediate follow-up question is: why are there so many settings?

Before we delve into that, let’s have a brief detour and see how we can find how many settings are defined in our build.

Finding the settings size

When you start sbt, pay attention to its output. If there are more than 10k settings in your build the exact number is printed on screen.

$ sbt
[info] Loading ...
[info] Resolving key references (32047 settings) ...   // <-- Printed if 10k+ settings
...

But here I actually wanted to take the opportunity to advertise the sbt consoleProject task, which is unfortunately not widely known but can be a huge time saver when debugging a build. What does it do? It’s a Scala REPL with your build definition loaded in it. If you like using the Scala REPL for quick code exploration, you’ll love consoleProject.

For instance, we can use consoleProject to retrieve the number of settings in our build in case it’s not printed by sbt on start. Note that eval is akin to calling .value to trigger evaluation of a setting inside a build.

$ sbt
...
akka > consoleProject
[info] Starting scala interpreter…
...
scala> buildStructure.eval.settings.size
res0: Int = 32047

Next time you are thinking of adding a println in the build to debug an issue, don’t! Instead, query the sbt internal state thanks to consoleProject!
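As a further sketch of what consoleProject makes possible, the lines below break the total down by project. They are meant to be typed at the scala> prompt inside consoleProject, not compiled on their own. They assume sbt 1.x, where each entry of buildStructure.eval.settings carries a scoped key whose scope has a project axis; treat the field names as an assumption to verify against your sbt version.

```scala
// Typed at the scala> prompt inside consoleProject (not a standalone program).
// Assumption: sbt 1.x, where every setting exposes key.scope.project.
val settings = buildStructure.eval.settings

// Group the settings by the project axis of their scope and count each group.
val perProject = settings.groupBy(_.key.scope.project).map { case (p, ss) => (p, ss.size) }.toSeq.sortBy(-_._2)

// Print the ten largest groups, biggest first.
perProject.take(10).foreach { case (p, n) => println(s"$n\t$p") }
```

Grouping by _.key.key.label instead of the project axis would show which keys are defined most often.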

Now, let’s get back on track and build an intuition of why there are so many settings in a build.

Why so many settings?

Turns out there are many settings in addition to just compile, test, and run. In fact, even an empty build with no global or project plugins already has around 700 settings defined in it! (If you don’t believe me, you now know how to check this on your own ;-)).

The number of settings in a build quickly grows with the number of plugins and projects defined in it. The ultimate reason for the explosion is the many scoping axes a setting can be bound to.

In fact, a setting can return a different value depending on the Configuration/Project/Task it’s bound to. You can visualize this as a 3D matrix or a cube.

For instance, sourceDirectory has different values in the Compile and Test scopes:

sbt:root> show root / Compile / compile / sourceDirectory
[info] .../src/main

sbt:root> show root / Test / compile / sourceDirectory
[info] .../src/test
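The same scoping can be used when defining settings. The build.sbt sketch below binds one key, scalacOptions, to three different scopes; the particular compiler flags are just an illustrative choice and are not taken from any of the builds mentioned in this article.

```scala
// build.sbt (sketch): one key, three scopes. The flags are illustrative only.
lazy val root = (project in file("."))
  .settings(
    // Configuration axis: value used when compiling main sources.
    Compile / scalacOptions ++= Seq("-deprecation", "-Xfatal-warnings"),
    // Configuration axis: value used when compiling test sources.
    Test / scalacOptions += "-Xlint",
    // Task axis: relax the options only for the REPL started by `console`.
    Compile / console / scalacOptions -= "-Xfatal-warnings"
  )
```

Each line contributes a separate entry to the build’s settings, even though they all refer to the same key.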

But not all tasks make sense for all scopes. As an example, the run task is never used on a library project, but it’s nevertheless always added to it.

The high degree of flexibility offered by sbt makes it easy to define settings. But this comes with a cost: the more settings we have, the longer it takes to load a build.
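To get a feel for how quickly the cube grows, here is a toy model in plain Scala. It is not sbt’s real scoping API: the Cell type and the axis values are hypothetical, and a real build does not define every key in every cell, so read the numbers as an upper bound for a single key.

```scala
// Toy model only: these types are hypothetical and are not sbt's real Scope API.
object ScopeCube {
  // One cell of the cube: a (project, configuration, task) triple.
  final case class Cell(project: String, config: String, task: String)

  // Enumerate every cell spanned by the three axes.
  def cells(projects: Seq[String], configs: Seq[String], tasks: Seq[String]): Seq[Cell] =
    for {
      p <- projects
      c <- configs
      t <- tasks
    } yield Cell(p, c, t)

  def main(args: Array[String]): Unit = {
    val configs = Seq("Compile", "Test")
    val tasks   = Seq("compile", "run", "console")

    // Each extra project adds configs.size * tasks.size cells a key could be bound to.
    for (n <- Seq(1, 5, 20)) {
      val projects = (1 to n).map(i => s"module$i")
      println(s"$n project(s): ${cells(projects, configs, tasks).size} cells")
    }
  }
}
```

With two configurations and three tasks, this prints 6, 30 and 120 cells for 1, 5 and 20 projects. Multiply that by the hundreds of keys a build starts with and the totals seen earlier become less surprising.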

Get ready for more

While settings play an important role in the build load time, there is more to it. We are going to fully cover the build load time (and how you can optimize it!) in the second instalment of this series of articles about sbt. Subscribe to our newsletter now if you don’t want to miss it!

If you have a question about this article or if there are other aspects of sbt you are puzzled about, ask the question on Twitter by mentioning @triple_quote.

And while you wait for compilation to finish, why not give Hydra a go ;-) It takes less than 5 minutes to set it up!