Building Software the Extreme Programming Way
In this hour you will learn how XP takes the idea of building or integrating the software and accelerates the cycle. Continuous integration is more than a concept or theory because XP has tools and techniques that make it a reality. In this hour we will cover
What integration is and why frequent integration is important
Where the practice of daily builds came from and how it works
What some of the obstacles to build automation are and how to avoid them
How you can continuously build the system, giving the customer real feedback into the state of the system
The Integration Cycle
Integration is the process of assembling the team's individual components and building them into the software product. This sounds simple enough: everyone ensures that his or her components pass unit tests, and then hands them over to the build master or build system for final construction. In reality, the number of interactions and dependencies between modules can create a nightmare. The task of troubleshooting build errors can drive even the most easygoing developer over the edge, which explains the "integration hell" tag. There's a chicken-and-egg problem here: integration is hard, so it's postponed until the last possible moment, but the greater the time between integrations, the more grueling it becomes. Figure 12.1 shows how changes increase over time.
Figure 12.1 Source changes increasing over time.
Changes to the source code that makes up the software might not follow the linear growth we see in Figure 12.1. In some cases the project starts slowly while the basic infrastructure is built, and then the source code quickly grows in size as more components are added. All the more reason to start integrating as soon as possible.
XP places a premium on small releases where the customer sees his software grow in value over time. Small releases sound like a laudable goal, but how can they be achieved? On waterfall projects, where there might be a single release after months of development, teams sometimes argue that time spent on build automation is wasted. But even on a project with a monolithic release cycle, the team can benefit from frequent builds.
Continuous integration in XP is supported by the practices of pair programming, refactoring, collective ownership, testing, and coding standards. Figure 12.2 illustrates how small releases necessitate frequent builds, and how this continuous integration is supported by other XP practices.
Figure 12.2 Continuous integration is supported by other XP practices.
Over the next few sections we'll cover why building the software every day is so crucial to the success of your project. You'll learn where the concept of the "daily build" came from, and how to define what a good build is.
Shipping the Product
Does it really matter how often the system is built? Software development is a means to an end that creates a product with business value for the customer. The foremost role of each team member is to ship the product. Jim McCarthy, former director of Microsoft's Visual C++ team, puts it this way in his book Dynamics of Software Development:
NOTE
From Dynamics of Software Development, Jim McCarthy, Microsoft Press, 1995.
It's absolutely vital to the success of the project that the team stays focused on shipping the product; all else is secondary. Developers aren't simply writing and compiling code, but are integral to the customer's product release. The all-too-common alternative is that developers "go dark" while the heavy work of construction is underway. It's almost like the customer is pacing back and forth at the delivery room door. Will it be a boy or a girl? (I'd better not take this analogy too far!) In all seriousness though, XP holds to rapid feedback between the technical and business sides. Developers listen a little, build a little, and then demonstrate to the customer.
Working software (or product) is the heartbeat the customer is interested in monitoring.
Building the Software Every Day
Microsoft is generally credited with inventing the concept of the daily build. Whether this is actually true is a moot point, but they certainly used it to good effect during their Windows NT development. A daily build is where automated scripts are run (usually nightly) to extract source from the version control system, compile, link, and then build the product. Normally, at the end of the build some kind of smoke test is run to check build status.
The daily build of the product is one of the most important diagnostic tools for the health of the software.
NOTE
The term smoke test is derived from the electronics industry, where people would plug in a circuit board or component, and then apply power to see what smoked.
These tests ensure the basic stability of the product by checking daily that nothing has been checked in that jeopardizes the basic product. If something is uncovered, it is repaired immediately. This provides a foundation for the new work to be added each day and ensures that the project doesn't stray too far from a stable working version.
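As a minimal sketch of what such a smoke test might look like, the following Python function only checks that the product starts and exits cleanly. The function name smoke_test and the two stand-in commands are assumptions made for illustration; a real build would point the command at its own freshly built executable and would usually check more than the exit code.

```python
import subprocess
import sys


def smoke_test(command, timeout_seconds=30):
    """Return True if the command starts and exits with status 0.

    This is deliberately shallow: a smoke test only asks whether the
    build is stable enough to be worth testing further.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, timeout=timeout_seconds
        )
    except (OSError, subprocess.TimeoutExpired):
        # The product could not be started, or it hung.
        return False
    return result.returncode == 0


# Hypothetical stand-ins for the built product: one that starts cleanly
# and one that fails on startup.
healthy_product = [sys.executable, "-c", "print('started')"]
broken_product = [sys.executable, "-c", "raise SystemExit(1)"]

print("healthy build passes:", smoke_test(healthy_product))
print("broken build passes:", smoke_test(broken_product))
```

Run at the end of the nightly script, a check like this gives the team a single pass or fail signal each morning.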
Figure 12.3 illustrates how developers work by checking out the source from the repository, and then using a dedicated build machine to run builds as required.
Figure 12.3 Developers using source control and a dedicated build machine.
The importance of a regular build cannot be overestimated. A good way of thinking about it is to consider how much time it would take to recover from a fatal flaw in a check-in. If you build weekly, and someone checks in an erroneous change on Monday that you don't find until you build on Friday, you might have to repair a week of work before health is restored. You've lost two weeks. If you built daily, you would lose only two days.
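The arithmetic behind this comparison can be written out as a short Python sketch. It rests on a simplifying assumption taken from the example rather than on any measured rule: the work built on top of the flaw has to be repaired, and the repair takes about as long as that work did. The function name days_lost is hypothetical.

```python
def days_lost(build_interval_days: int) -> int:
    """Worst-case working days lost to a flaw checked in just after a build.

    Assumption: the flaw goes unnoticed until the next build, and repairing
    the work done on top of it takes as long as the work itself.
    """
    days_built_on_flaw = build_interval_days
    days_to_repair = build_interval_days
    return days_built_on_flaw + days_to_repair


# A weekly build spans 5 working days; a daily build spans 1.
for label, interval in [("weekly", 5), ("daily", 1)]:
    print(f"{label} build: about {days_lost(interval)} working days lost")
```

Under this assumption the loss grows in direct proportion to the gap between builds, which is the case for shortening that gap as far as the team can manage.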
Building frequently is fundamentally about getting the project to a known state and ensuring that it stays there.
Defining a Good Build
Writing scripts that pull together the source and build the product is certainly the basis for rapid development, but what is a good build? If everyone's role is to ship the software, the ability to run a few scripts that attempt a build means nothing by itself. The build must pass a series of verification tests that enable the team to call the build good. Depending on your project, you might devise your own criteria for a successful build; here is an outline to start with:
All working source files are checked into the source control repository.
All source files are compiled (irrespective of date).
Object or intermediate files are linked to create deployment binaries (such as executables, JAR files, and dynamic link libraries).
The system is verified against a series of tests; this smoke test ensures the testability of the system.
The build can be called good or successful if the system passes the verification or smoke tests. It's then ready for testing or review by the customer and team.
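The outline above can be sketched as a small Python driver that runs each stage in order and calls the build good only if every stage, including the smoke tests, succeeds. The stage functions here are placeholders (assumptions for illustration) that simply return True or False; in a real script each one would invoke the team's own version control, compiler, linker, and test commands.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_build(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run the steps in order and stop at the first failure.

    Returns (is_good, log). The build is good only if every step passes.
    """
    log: List[str] = []
    for name, action in steps:
        succeeded = action()
        log.append(f"{name}: {'ok' if succeeded else 'FAILED'}")
        if not succeeded:
            return False, log
    return True, log


# Placeholder stages; real ones would call out to the actual tools.
def always_ok() -> bool:
    return True


def failing_smoke_test() -> bool:
    return False


good_build = [
    ("get source from repository", always_ok),
    ("compile all source files", always_ok),
    ("link deployment binaries", always_ok),
    ("run smoke tests", always_ok),
]
broken_build = good_build[:3] + [("run smoke tests", failing_smoke_test)]

for steps in (good_build, broken_build):
    is_good, log = run_build(steps)
    print("GOOD BUILD" if is_good else "BROKEN BUILD", log)
```

Stopping at the first failed stage keeps the log short and points the team at the earliest problem, which matters when a broken build has to be fixed right away.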
The team has two choices if the build breaks (fails tests):
Fix problems immediately. If the build cycle is short (that is, there are fewer changes between builds), problems will be easier to track down. Going home if your code broke the build is not just bad form; it lets the whole team down. Collective ownership means collective responsibility. Fixing the broken build remains the top priority of the development team.
Back out the changes. The team might decide to back out changes in a component if integration problems are traced to fundamental flaws or problems (for example, the first time a third-party module is used in the build). Generally speaking, XP minimizes the risk of complicated integration problems by increasing the frequency of builds. However, we should at least recognize the possibility of unknowns that threaten to sidetrack the team. In that case, the better decision is to roll back the change and perform more testing before integrating again.
Integration Hell: When Will We Ever Learn?
I had a couple of recent experiences that illustrate why frequent builds will save your skin and sanity. I watched as a project with 10 or so developers struggled time and time again to integrate every six to eight weeks. For some reason they never had the time to get their source and build house in order! Too much work, too much trouble. Each build weekend followed the same pattern of name-calling, finger-pointing, and raised voices. The common statement was "it works fine on my machine." Unfortunately, the customer was not running their global knowledge portal on the developer's PC! Here's an interesting insight into a developer mindset: They claimed the integration was fine, and the problems were caused by deployment. So, the build worked (it compiled), but didn't work (it didn't pass any tests). The team stubbornly refused to learn the hard lessons and persisted with no source control and manual builds until the bitter end.
I was called in after a subsequent offshoot development began to flounder, and I mandated an automated build. We took maybe two days to clean up source control and write batch scripts. From this point onward the build took around 10 minutes from start to finish, including deployment to test servers.
I'll take automation anytime; I like my weekends.
Placing the Source into the Safe
The keystone of building, no matter how often, is that all source must reside in a source control repository. Your development tool will dictate, up to a point, which product you choose, as its integration with the IDE could be a definite plus for developers. Source control is easiest to get under control at the start of the project. Tidying up in midproject is a headache for all concerned. Some teams like to plan out the structure of the source management system early in the project. You can take this approach or decide on a high-level layout, leaving the remainder until required. Figure 12.4 demonstrates a possible structure for your source management system.
Figure 12.4 Suggested source repository structure.
An interesting thing to note is that we have a build folder. In actuality, this is no more than a mirror or reflection of the underlying source folders. The reason for this technique is that it can simplify the build process by requiring the build tool to get source from only one tree. Most source systems enable users to replicate or shadow working folders to others in the repository. Otherwise, you'd need to check out or get (a read-only copy of) the files, and then mask the folders that are not required (documentation, and so on). Applying a recursive "get" will usually result in a folder structure on the build machine that mimics the repository. Without the build folder, your job of writing a build script will be much more complicated!
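To give a feel for the extra masking work a build script takes on when there is no build folder, here is a minimal Python sketch that filters the result of a recursive "get". The folder names in EXCLUDED_FOLDERS and the sample paths are hypothetical; they are not taken from Figure 12.4.

```python
from pathlib import PurePosixPath

# Hypothetical folders the build does not need.
EXCLUDED_FOLDERS = {"docs", "specs"}


def mask_folders(paths, excluded=EXCLUDED_FOLDERS):
    """Keep only the files that do not sit under an excluded folder."""
    kept = []
    for path in paths:
        # Look at the folder components only, not the file name itself.
        folders = PurePosixPath(path).parts[:-1]
        if not any(folder in excluded for folder in folders):
            kept.append(path)
    return kept


# Hypothetical listing returned by a recursive "get" of the whole repository.
repository_files = [
    "project/src/main.c",
    "project/src/util/strings.c",
    "project/docs/design.txt",
    "project/specs/requirements.txt",
]

print(mask_folders(repository_files))
```

With a mirrored build folder this filtering disappears, because the build tool gets only the tree it needs.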
Another frequently missed aspect of source control is that all supporting libraries and binaries should also be included. Keep in mind that your goal is to build the entire application from the source-management system, and failing to include your external components will stymie this. You should be able to point a clean or pristine machine at the source database, run the base build script, and then watch as the system is built from scratch. This is not as hard as it sounds and requires dedication rather than technical brilliance.
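One way to guard against missing external components is to have the build script check the freshly fetched tree against a list of required files before it starts compiling. The Python sketch below assumes such a list exists; the function name missing_components and the file names are hypothetical, and a temporary directory stands in for the checkout on a clean build machine.

```python
import tempfile
from pathlib import Path


def missing_components(checkout_root, required):
    """Return the required files that are absent from the checked-out tree."""
    root = Path(checkout_root)
    return [relative for relative in required if not (root / relative).is_file()]


# Simulate a checkout that contains one third-party library but not the other.
with tempfile.TemporaryDirectory() as checkout:
    root = Path(checkout)
    (root / "lib").mkdir()
    (root / "lib" / "vendor.jar").write_bytes(b"")

    required = ["lib/vendor.jar", "lib/parser.dll"]  # hypothetical names
    print("missing:", missing_components(root, required))
```

Failing the build early with a list like this is more helpful than a link error halfway through, and it keeps the "build from scratch on a clean machine" goal honest.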
TIP
At the risk of stating the obvious: Your source-management system should be backed up! Don't let one hard-drive failure ruin all your hard-won gains from source management.