Technology • Apr 11, 2018
3 Bad Continuous Integration Smells Part 2: Multiple Builds in the Pipeline
In the last blog post in this series I reviewed one of the common shortcomings immediately seen when companies ask us to evaluate their continuous integration (CI) practices—manually invoking the build.
This time, I’ll review another “smell” I’ve encountered that can reduce the velocity and quality of your team’s output, and I’ll offer specific suggestions to keep your team humming along.
Smell 2: You Build Your Application Multiple Times in the Pipeline
Let’s assume your team is responsible for developing a Java-based web application that’s packaged into an executable JAR to be run in a remote environment. Ideally, each commit to trunk will result in:
An automated invocation of the CI pipeline.
Packaging of the application into a JAR.
Successful execution of a series of quality tests.
An upload of the JAR to a centralized repository such as Nexus or Artifactory.
If the JAR needs to be deployed to one or more environments—up to and including the production environment—it’s pulled from the centralized repository.
A key thing to note is that each point in the application’s commit history should produce at most one execution of the build that results in one JAR being uploaded to the centralized artifact repository. That means the built JAR needs to be agnostic to the environment to which it will be deployed. Some teams, however, build their applications multiple times throughout the pipeline. Such a process is both inefficient and accompanied by various pitfalls.
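The "one build per commit, promoted everywhere" rule can be sketched in a few lines. The example below is only an illustration under stated assumptions: `ArtifactStore` is a hypothetical in-memory stand-in for a repository such as Nexus or Artifactory (it does not use either product's API), and the commit ID is assumed to be the key under which the JAR is published. It rejects a second build of the same commit and exposes a SHA-256 digest so a deployment step can confirm it is running the exact bytes that were tested.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for an artifact repository such as Nexus or Artifactory. */
final class ArtifactStore {
    private final Map<String, byte[]> artifactsByCommit = new HashMap<>();

    /** Publishes the artifact for a commit and rejects a second build of the same commit. */
    void publish(String commitId, byte[] jarBytes) {
        if (artifactsByCommit.putIfAbsent(commitId, jarBytes.clone()) != null) {
            throw new IllegalStateException("Commit " + commitId + " already has a published artifact");
        }
    }

    /** Every environment pulls the same bytes; nothing is rebuilt at deploy time. */
    byte[] fetch(String commitId) {
        byte[] jar = artifactsByCommit.get(commitId);
        if (jar == null) {
            throw new IllegalArgumentException("No artifact for commit " + commitId);
        }
        return jar.clone();
    }

    /** Hex-encoded SHA-256 digest, used to confirm that two artifacts are byte-for-byte identical. */
    static String sha256(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

In a real pipeline, the test, staging, and production deployments would each call something like `fetch` with the same commit ID and compare digests, rather than invoking the build again.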
Why Teams Build More Than Once
The most common reasons I’ve seen for invoking multiple builds throughout a single CI pipeline have to do with the constraints imposed by the chosen environment configuration and branch management strategies. Let’s review both.
1. Environment-specific configuration or generated code is bound to the build artifact.
This applies to your team if the build step in your CI pipeline must know which environment the application will be deployed to, and uses that knowledge to package generated code or environment-specific values into the artifact in a way that cannot be overridden at runtime. For example, a team with test and production environments would end up with two builds, one for each environment.
This is, by definition, inefficient since the build process will be invoked more than once. Moreover, if configuration is bound to the built application, any configuration change will require a new build and deployment. There’s also the added risk that the wrong build is deployed to the target environment (e.g., deploying the test build to the production environment).
Much of the time, the solution is to externalize the environment-specific configuration and inject it into the application at runtime using environment variables, startup arguments, or centralized configuration.
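As a rough illustration of runtime injection, the sketch below resolves a setting from startup arguments first, then from environment variables, then from a default. The class name `RuntimeConfig`, the lookup order, and the convention that a key like `db.url` maps to the environment variable `DB_URL` are all assumptions made for this example, not part of any particular framework.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Resolves settings at runtime so the same JAR can run in any environment. */
final class RuntimeConfig {
    private final Map<String, String> startupArgs;
    private final Map<String, String> environment;

    RuntimeConfig(Map<String, String> startupArgs, Map<String, String> environment) {
        this.startupArgs = new HashMap<>(startupArgs);
        this.environment = new HashMap<>(environment);
    }

    /** Wiring for a real process: JVM system properties (-Dkey=value) and the process environment. */
    static RuntimeConfig fromProcess() {
        Map<String, String> props = new HashMap<>();
        for (String name : System.getProperties().stringPropertyNames()) {
            props.put(name, System.getProperty(name));
        }
        return new RuntimeConfig(props, System.getenv());
    }

    /** Startup arguments win over environment variables, which win over the default. */
    String get(String key, String defaultValue) {
        String fromArgs = startupArgs.get(key);
        if (fromArgs != null) {
            return fromArgs;
        }
        // Assumed naming convention: "db.url" is looked up as the environment variable "DB_URL".
        String envName = key.toUpperCase(Locale.ROOT).replace('.', '_');
        String fromEnv = environment.get(envName);
        return fromEnv != null ? fromEnv : defaultValue;
    }
}
```

With this in place, the same JAR could be started as `java -Ddb.url=... -jar app.jar` in one environment and with `DB_URL` exported in another, and no rebuild is needed when the value changes.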
The ideal solution is to have a persistent, versioned, externalized configuration mechanism that provides the capability to modify the runtime behavior of the application dynamically. This eliminates the building of applications bound to an environment and provides the functional basis for toggles to control the behavior of the application once it’s in operation, all without having to rebuild or redeploy the application. In the world of Java, Spring Cloud Config and Archaius provide this type of useful functionality, or you can roll your own centralized configuration suitable to your needs.
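For the roll-your-own route, the core idea can be sketched as an application-side holder for versioned configuration snapshots. Everything here is hypothetical: `DynamicConfig`, `Snapshot`, and the toggle name used in the usage note are invented for illustration, and the code that would poll a central configuration source and call `refresh` is left out. This is not how Spring Cloud Config or Archaius are used; it only shows the behavior they make possible.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical application-side view of a centralized, versioned configuration source. */
final class DynamicConfig {

    /** An immutable snapshot: a version number plus the settings at that version. */
    static final class Snapshot {
        final long version;
        final Map<String, String> settings;

        Snapshot(long version, Map<String, String> settings) {
            this.version = version;
            this.settings = Collections.unmodifiableMap(new HashMap<>(settings));
        }
    }

    private final AtomicReference<Snapshot> current;

    DynamicConfig(Snapshot initial) {
        this.current = new AtomicReference<>(initial);
    }

    /** Applies a newer snapshot, e.g. after polling the config source; stale versions are ignored. */
    boolean refresh(Snapshot candidate) {
        while (true) {
            Snapshot seen = current.get();
            if (candidate.version <= seen.version) {
                return false;
            }
            if (current.compareAndSet(seen, candidate)) {
                return true;
            }
        }
    }

    /** A toggle is on only when its value is "true" (case-insensitive); missing toggles are off. */
    boolean isEnabled(String toggle) {
        return Boolean.parseBoolean(current.get().settings.get(toggle));
    }

    long version() {
        return current.get().version;
    }
}
```

Application code would check something like `config.isEnabled("new-checkout")` on each request, so flipping the toggle in the central store changes behavior on the next refresh without a rebuild or redeployment.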
2. Your team uses a branch-per-environment strategy for managing releases.
In this case, teams create a branch for each target runtime environment, and build from each branch when they are ready to deploy to its respective environment. For instance, if the team has integration, staging, and production environments, their code repository will contain corresponding branches with the same names. Developers will initially commit and build their code from the integration branch, and deploy the resulting artifact to the integration environment. Next, code is merged from the integration branch into the staging branch, and the build and deployment process is invoked again. Finally, code is merged from staging into the production branch, built, and deployed to the production environment.
I’ve mostly seen this approach used as a means of organizing code and environments for concurrent development of consecutive releases. But it quickly breaks down when the order of releases is changed or unknown and the team struggles to understand how to manage merging between the branches.
On top of that, this approach not only incurs the overhead of redundant builds, it also carries all of the aforementioned downsides of not doing continuous integration. This is not a recipe for speed, agility, or quality.
In the next blog post, I’ll review the underlying problems that make distinguishing the functional differences between your build artifacts difficult. In the meantime, if you have any questions or would like help evaluating the state of your CI pipeline, drop us a line at firstname.lastname@example.org.