3.7 Generic Processes
We can also make processes more generic to improve efficiency. For example, onboarding new employees is a complex process. In some companies each division or team has a different onboarding process. In some places engineers have a different onboarding process than non-engineers. Each of these processes is pet-like. It takes extra effort to reinvent each process again and again. Improvements made for one process may not propagate to the others. However, this situation often arises because different teams or departments do not communicate.
In contrast, some companies have a unified onboarding process. The common aspects such as paperwork and new employee training are done first. The variations required for different departments or roles are saved to the end. You would think this is a no-brainer and every company would do this, but you’d be surprised at how many companies, both large and small, have a pet-like onboarding process, or unified the process only after losing money due to a compliance failure that required the company to clean up its act.
Another example is the process for launching and updating applications in production. Large companies often have hundreds of internal and external applications. A retailer like Target has thousands of applications ranging from inventory management to shipping and logistics, forecasting, electronic data interchange (EDI), and the software that handles the surprisingly complex task of generating price labels.
In many organizations each such application has been built using different software technologies, languages, and frameworks. Some are written in Java; others in Go, Python, or PHP. One requires a particular web framework. Another requires a particular version of an operating system. One requires a certain OS patch; another won’t work on a machine with that patch. Some are delivered as an installable package; with others the developer emails a ZIP file to the system administrators.
As a result the process of deploying these applications in production is very complex. Each new software release requires the operations team to follow a unique or bespoke process. In some cases the process is full of new and different surprises each time, often based on which developer led that particular release. Joe sends ZIP files; Mary sends RAR files. Each variation requires additional work and additional knowledge, and adds complexity and risk to the production environment. Each variation makes automation more difficult.
In other words, each of these processes is a pet. So how can we turn them into cattle?
Around 2012 a number of organizations identified the need to unify these processes. Many new technologies appeared, one of which was the Docker Container format. It is a format for software distribution that also unifies how production environments deploy applications. This format not only includes all the files required for an application or service, but also includes a standard way to connect and control them. Docker Containers includes meta-information such as which TCP port the service runs on. As a consequence, in a service hosting environment nearly all applications can be deployed the same way. While not every application can work in the Docker Container system, enough can to greatly reduce the number of pets in the environment.
The Docker system includes a number of elements. The Docker Container image is an archive file (like ZIP or TAR) that includes all the files required for a particular service. A Dockerfile is a file that describes how to build an image in an automated fashion, thereby enabling a repeatable process for building images. A Docker compose file defines a complex application made up of many containers, and describes how they talk to each other.
Listing 3.1 is a Dockerfile that describes how to create an image that includes the Apache HTTP server and related files. The EXPOSE 80 statement indicates that the software this image runs needs exclusive access to TCP port 80.
Listing 3.1: A Dockerfile describing how to build a Docker image
FROM ubuntu :12.04 RUN apt -get update && apt -get install -y apache2 && apt -get clean && rm -rf /var/lib/apt/lists /* ENV APACHE_RUN_USER www -data ENV APACHE_RUN_GROUP www -data ENV APACHE_LOG_DIR /var/log/apache2 EXPOSE 80 CMD ["/ usr/sbin/apache2", "-D", "FOREGROUND "]
Listing 3.2 shows a Docker compose file for an application that consists of two services: one that provides a web-based application and another that provides API access. Both require access to a MySQL database and a Redis cache.
Listing 3.2: A Docker compose file for a simple application
services: web: git_url: git@github.com:example/node -js -sample.git git_branch: test command: rackup -p 3000 build_command: rake db:migrate deploy_command: rake db:migrate log_folder: /usr/src/app/log ports: ["3000:80:443" , "4000"] volumes: ["/tmp:/tmp/mnt_folder "] health: default api: image: quay.io/example/node command: node test.js ports: ["1337:8080"] requires: ["web"] databases: - "mysql" - "redis"
With a standardized container format, all applications can be delivered to production in a form so sufficiently self-contained that IT doesn’t need to have a different procedure for each application. While each one is wildly different internally, the process that IT follows to deploy, start, and stop the application is the same.
Containers can be used to build a beta environment. Ideally, the test environment will be as similar to the production environment as possible. Anytime a bug is found in production, it must be reproduced in the test environment to be investigated and fixed. Sometimes a bug can’t be reproduced this way, and fixing it becomes much more difficult.
The reality is that at most companies the beta and production environments are very different: Each is built by a different group of people (developers and SAs) for their own purposes. A story we hear time and time again is that the developers who started the project wrote code that deploys to the beta environment. The SAs were not involved in the project at the time. When it came time to deploy the application into production, the SAs did it manually because the deployment code for the beta environment was unusable anywhere else. Later, if the SAs automated their process, they did it in a different language and made it specific to the production environment. Now two code bases are maintained, and changes to the process must be implemented in code twice. Or, more likely, the changes are silently made to the beta deploy code, and no one realizes it until the next production deployment breaks. This sounds like a silly company that is the exception, but it is how a surprisingly large number of teams operate.
Not only do containers unify the production environment and make it more cattle-like, but they also improve developer productivity. Developers can build a sandbox environment on their personal workstations by selecting the right combination of containers. They can create a mini-version of the production environment that they can use to develop against. Having all this on their laptops is better than sharing or waiting their turn to use a centrally administered test stage environment.
In June 2015 the Open Container Initiative (OCI) was formed to create a single industry-wide standard for container formats and run-times. Docker, Inc., donated its container format and runtime to serve as the basis of this effort.
Containers are just one of many methods for unifying this process.