Book Club: The DevOps Handbook (Chapter 12. Automate and Enable Low-Risk Releases)

This entry is part [part not set] of 25 in the series DevOps Handbook

The following is a chapter summary for “The DevOps Handbook” by Gene Kim, Jez Humble, John Willis, and Patrick DeBois for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discuss books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Background on The DevOps Handbook

More than ever, the effective management of technology is critical for business competitiveness. For decades, technology leaders have struggled to balance agility, reliability, and security. The consequences of failure have never been greater―whether it’s the healthcare.gov debacle, cardholder data breaches, or missing the boat with Big Data in the cloud.

And yet, high performers using DevOps principles, such as Google, Amazon, Facebook, Etsy, and Netflix, are routinely and reliably deploying code into production hundreds, or even thousands, of times per day.

Following in the footsteps of The Phoenix Project, The DevOps Handbook shows leaders how to replicate these incredible outcomes, by showing how to integrate Product Management, Development, QA, IT Operations, and Information Security to elevate your company and win in the marketplace.

The DevOps Handbook

Chapter 12

Kent Beck, the creator of the Extreme Programming methodology, one of the leading proponents of Test Driven Development, and technical coach at Facebook, provided details on their code release strategy at Facebook.

“Chuck Rossi made the observation that there seem to be a fixed number of changes Facebook can handle in one deployment. If we want more changes, we need more deployments. This has led to a steady increase in deployment pace over the past five years, from weekly to daily to thrice daily deployments of our PHP code and from six to four to two-week cycles for deploying our mobile apps. This improvement has been driven primarily by the release engineering team.”

Kent Beck, The DevOps Handbook

Automation The Deployment Process

Some recommended good practices for automating the deployment process include:

  • Packaging code in ways suitable for deployment
  • Creating pre-configured virtual machine images or containers
  • Automating the deployment and configuration of middleware
  • Copying packages or files onto production servers
  • Restarting servers, applications, or services
  • Generating configuration files from templates
  • Running automated smoke tests to make sure the system is working and correctly configured
  • Running testing procedures
  • Scripting and automating database migrations

Deployment Pipeline Requirements:

Deploying the same way to every environment: By using the same deployment mechanism for every environment, production deployments are likely to be far more successful since the team knows that it’s been successfully deployed many times already earlier in the pipeline.

Smoke testing deployments: During the deployment process, test connections to any supporting systems (e.g., databases, message buses, external services) and run a single test “transaction” through the system to ensure that the system is performing as designed. If any of these tests fail, the deployment should be failed.

Ensure consistent environments: In previous steps, the team created a single-step environment build process so that the development, test, and production environments had a common build mechanism. The team must continually ensure that these environments remain synchronized.

Adopted from The DevOps Handbook

Enable Automated Service Deployments

Tim Tischler, Director of Operations Automation at Nike, describes the common experience of a generation of developers: “As a developer, there has never been a more satisfying point in my career than when I wrote the code, when I pushed the button to deploy it, when I could see the production metrics confirm that it actually worked in production, and when I could fix it myself if it didn’t.”

Puppet Labs 2013 State of DevOps Report, which surveyed over four thousand technology professionals, found that there was no statistically significant difference in the change success rates between organizations where Development deployed code and those where Operations deployed code.

Changes to Deployment Strategy:

Build: The deployment pipeline must create packages from version control that can be deployed to any environment, including production.

Test: Anyone should be able to run any or all of our automated test suite on their workstation or on test systems.

Deploy: Anybody should be able to deploy these packages to any environment.

Integrate Code Deployments Into The Deployment Pipeline

Deployment automation must provide the following capabilities:

  • Ensure packages created during the continuous integration process are suitable for deployment into production
  • Show the readiness of production environments at a glance
  • Provide a push-button, self-service method for any suitable version of the packaged code to be deployed into production
  • Record automatically — for auditing and compliance purposes — which commands were run on which machines when, who authorized it, and what the output was
  • Run a smoke test to ensure the system is operating correctly and the configuration settings — including items such as database connection strings — are correct
  • Provide fast feedback for the deployer so they can quickly determine whether their deployment was successful
Adopted from The DevOps Handbook

Etsy provided solid insight into the state of their deployment process including capabilities.

The goal at Etsy has been to make it easy and safe to deploy into production with the fewest number of steps and the least amount of ceremony. For instance, deployments are performed by anyone who wants to perform a deployment. Engineers who want to deploy their code first go to a chat room, where engineers add themselves to the deploy queue, see the deployment activity in progress, see who else is in the queue, broadcast their activities, and get help from other engineers when they need it. They execute 4,500 unit tests locally and all external calls have been stubbed out. After they check-in their changes to trunk in version control, over seven thousand automated trunk tests are instantly run on their continuous integration (CI) servers.

Decouple Deployments From Releases

Deployment is the installation of a specified version of software to a given environment (e.g., deploying code into an integration test environment or deploying code into production). Specifically, a deployment may or may not be associated with a release of a feature to customers.

Release is when the team makes a feature available to all our customers or a segment of customers. The code and environments should be architected in such a way that the release of functionality does not require changing our application code.

There are two broad categories of release patterns:

Environment-based Release Patterns: two or more environments to deploy into, but only one environment is receiving live customer traffic. New code is deployed into a non-live environment, and the release is performed moving traffic to this environment. These patterns include blue-green deployments, canary releases, and cluster immune systems.

Application-based Release Patterns: modify the application to selectively release and expose specific application functionality by small configuration changes. For instance, feature flags can progressively expose new functionality in production to the development team, all internal employees, 1% of the customers, or the entire customer base. This enables a technique called dark launching, where all the functionality to be launched is staged in production and is tested with production traffic before the release.

Environment-based Release Patterns

The simplest of the three patterns is called blue-green deployment. In this pattern, there are two production environments: blue and green. At any time, only one of these is serving customer traffic. The benefits are enabling the team to perform deployments during normal business hours and conduct simple changeovers.

Adopted from The DevOps Handbook

To implement the pattern, create two databases (blue and green database): Each version, blue (old) and green (new), has its own database. During the release, the blue database is put into read-only mode, followed by a backup operation, then a restore of the green database, and finally switch traffic to the green environment.

Additionally the team must decouple database changes from application changes. Instead of supporting two databases, the team decouples the release of database changes from the release of application changes by doing two things: (1) make only additive changes to the database, which means never mutating existing database objects; and, (2) make no assumptions in the application about which database version will be in production.

The canary release pattern automates the release process of promoting to successively larger and more critical environments as the team confirms the code is operating as designed. The term canary release comes from the tradition of coal miners bringing caged canaries into mines to provide early detection of toxic levels of carbon monoxide. If there was too much gas in the cave, it would kill the canaries before it killed the miners, alerting them to evacuate.

Adopted from The DevOps Handbook

For the above diagram:

  • A1 group: Production servers that only serve internal employees.
  • A2 group: Production servers that only serve a small percentage of customers and are deployed when certain acceptance criteria have been met (either automated or manual).
  • A3 group: The rest of the production servers, which are deployed after the software running in the A2 cluster meets certain acceptance criteria.

There are two benefits to this type of safeguard: (1) the team protects against defects that are hard to find through automated tests; and, (2) the time required to detect and respond to the degraded performance by the change is reduced.

Application-based Patterns To Enable Safer Releases

Feature Toggles benefits:

  • Easy Roll Back – features that create problems or interruptions in production can be quickly and safely disabled by merely changing the feature toggle setting.
  • Gracefully degrade performance – when service experiences extremely high loads that would normally require an increase in capacity or risk having our service fail in production, feature toggles can reduce the quality of service.
  • Increase resilience through a service-oriented architecture – If a feature relies on another service that isn’t complete yet, the team can still deploy our feature into production but hide it behind a feature toggle. When that service finally becomes available, then toggle the feature on.

Feature toggles allow features to be deployed into production without making them accessible to users, enabling a technique known as dark launching.

Dark Launch benefits:

  • Deploy all the functionality into production and then perform testing of that functionality while it’s still invisible to customers.
  • Safely simulate production-like loads, providing confidence that the service will perform as expected.
Series Navigation

Leave a Reply

Discover more from Red Green Refactor

Subscribe now to keep reading and get access to the full archive.

Continue reading