Overview of Content Management
It’s been a hard day’s night, And I’ve been working like a dog.
- John Lennon and Paul McCartney
From Prototype to Enterprise
It has been a wild four years. Rita’s original idea was to build a web site in her spare time to improve the wooden suggestion box outside her company’s lunchroom. Over the years, the penciled suggestions on scraps of paper yielded insightful ways to improve product quality, manufacturing efficiency, and employee morale, to name a few. The suggestions bypassed the traditional chain of command, and although the suggestions were almost always anonymous, Rita’s group tirelessly made special efforts to respond seriously to each one. Perhaps the reason the aging wooden receptacle led to the changes that it did was the immediacy of the follow-ups. Paradoxically, her group had never been formally charged with being the keepers of the suggestion box. It just seemed to be the right thing to do. Rita herself could not have predicted the events that would follow.
2 A.M. Software
It is 1996, and the Internet Age has dawned. No one in the company recalls where the idea came from, but perhaps it doesn’t matter. Why not augment the wooden suggestion box with a web site? That way, employees outside that immediate location can participate as well. Rita takes on the challenge during evenings and weekends. She refers to it as "2 A.M. software."
Because it is a skunkworks project, news of the web site’s existence spreads by word of mouth and internal email. The suggestion box web site enjoys a steadily increasing flow of visitors. At first, employees from other company locations visit to see firsthand how the democracy of ideas within that humble support facility can lead to tangible improvements. Soon thereafter, suggestions from other sites begin trickling in. Many of the suggestions pertain to company-wide operations.
Rita is the sole part-time web developer on the project. Just before leaving for an out-of-town conference, she demonstrates a prototype to a group of product managers to explain the value of the suggestion system. The product managers offer many suggestions to improve the usability of the interface. After returning from the conference, Rita dives into implementing their recommendations.
Without the usual distracting chatter of phone conversations filling the air, Rita adds the pull-down menus to the entry page surprisingly easily. She likes the way that her new menus remove the visual clutter from the page. Eager to show off the new interface, she asks a colleague to try it out. When he tries it, Rita realizes that she hasn’t tested her changes with the Netscape browser. They notice that her menus don’t erase properly. Because she hasn’t tested regularly with different browsers, some new code that she recently added is the likely culprit. But which change is it?
The one-week hiatus renders Rita’s recollection of the web site internals hazy. She needs help recalling what worked and what didn’t work when she last did the demo. Luckily, Rita versioned the entire source tree of the demo web site before leaving on her trip. She’s able to diff the files to figure out which of her recent changes are candidates for the browser sensitivity that she seems to have introduced.
Web site versioning plays a crucial role in the web development process. It is essential to periodically capture known snapshots of the web site, as we see in Rita’s early efforts with the web site. Snapshots serve several purposes. First, with a known snapshot of the web site, a developer can roll back to a known good copy of the web site. Second, if the web site or a section of it becomes hopelessly broken, a developer needs to be able to selectively pick assets to revert to. An asset is an electronic artifact that embodies intellectual property of an organization. Having a known working set of web assets lets a developer make changes with the assurance that a working copy is always available for comparison. Finally, a working copy can serve as a reference copy for discerning what changed and what didn’t. Typically, only a small fraction of a set of files changes from one day or week to the next, which makes a reference working copy of a web site invaluable for locating the changes that led to a problem. The need for site versioning typically arises early in a web site’s life cycle.
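To make the idea of a reference copy concrete, here is a minimal sketch in Python of comparing a working copy of a site against a known-good snapshot. It assumes that both are plain directories on disk. The function names `snapshot_digests` and `compare_snapshots` are hypothetical, chosen for illustration, and do not come from any particular content-management product.

```python
import hashlib
from pathlib import Path


def snapshot_digests(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to a SHA-256 digest of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_snapshots(reference: Path, working: Path) -> dict[str, list[str]]:
    """Report which files were added, removed, or changed since the reference."""
    ref = snapshot_digests(reference)
    cur = snapshot_digests(working)
    return {
        "added": sorted(cur.keys() - ref.keys()),
        "removed": sorted(ref.keys() - cur.keys()),
        # A file counts as changed when it exists in both trees with different bytes.
        "changed": sorted(k for k in ref.keys() & cur.keys() if ref[k] != cur[k]),
    }
```

Because only a small fraction of files usually changes between snapshots, the "changed" list tends to be short, and it is the natural starting point when hunting for a regression such as Rita’s menu problem.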
The Pioneers
The team expands to four members. Rita is now the web architect, an informal leader of her band of renegades. They soon find themselves doing independent tasks. Sandy is the lead quality assurance engineer whose primary goal is to build a test harness for the business logic embedded in the web application. She has also volunteered to prototype the online help system. Max is the CGI developer, and he is converting the text file–based suggestion repository to use the corporate-standard commercial database. James is the interface designer, and he explores ways to simplify the interface.
It is Tuesday. James breezes past the unattended reception desk. He hears a few clattering keyboards from the early risers in the office. As he settles into his cubicle, he listens to the reassuring sound of the coffee maker gurgling the morning’s first pot of coffee. James reloads the page that he changed yesterday before he went home.
"What happened to my changes?" James cries out, as he throws his hands into the air. He rechecks the URL to verify that he’s indeed pointing at the right location. The reality sinks in. Someone has overwritten his changes with other changes last night. James gets to his feet. He silently paces the aisle.
Later that day, after much wrangling, the team decides to adopt a practice that gives each developer separate areas to do their work. James does his best to re-create his changes from memory.
Two months pass. The team works 18 hours per day, 7 days per week, to prepare for its presentation to a corporate Internet task force. To build the team’s pitch, Rita gathers recent examples of business improvements that gathered momentum from their web site. She hopes to solicit support from the high-level executives charged with selecting and funding a promising set of web initiatives identified by the Internet task force.
James has numerous changes to the homepage file, index.html, and to the first-level pages, such as suggest.html, to deliver the promise of his slick new user interface. Max has changes, too. Although he has been working independently and in relative isolation on his database subsystem, it is now time to integrate that code into the homepage and first-level pages as well.
They hammer out an integration plan on Friday after their weekly status meeting, but when 2 A.M. Sunday morning comes around, Max finds that James hasn’t completed the changes that they agreed to. Either James has forgotten or his progress was delayed. Unfortunately, when Max snoops around in the directories on James’s development machine, he sees various versions of half-completed sets of pages. Max is stuck. He has to either make guesses about what James intended to do or call James at home. He relishes neither option. He had intended to finish his integration Saturday night because he promised to help with his niece’s sleepover birthday party on Sunday.
We see the issue of managing concurrent changes coming to the fore when the group reaches four members. Because each person is off doing different projects, and especially because of the nature of web technologies, they hit a "web-wall." This is the point in the life cycle of a web site when the combination of the number of developers, the number of assets, and the pace of development exceeds the ability of informal coordination mechanisms to do the job adequately.
The Tornado
Named a finalist by the Internet task force, the team gains additional funding and grows to 20. The pace of change outstrips the ability of any single person to keep track of changes. As of August 1998, the team is tackling the following projects:
Converting to template UI for rebranding
Writing scripts to regenerate UI
Implementing a better help system
Reimplementing form entry based on templates
Building a client portal
Each project involves a cross-functional team of two to four developers. The project to regenerate the template-based user interface has a scripting component, an HTML component, and a user interface design aspect. Keeping track of any one of these projects taken on its own would be difficult enough, but keeping track of any two of these projects simultaneously is beyond the grasp of a single person. All five projects taken together require major infrastructure assistance.
In August, James leads a three-person project to convert to a template-based user interface for branding purposes. The release date is near, and the team is deeply involved in testing. Meanwhile, Joe leads a two-person team getting started on implementing the help system. The help team feels strongly about checking in partial releases of their subsystem. Because they have the suggestion system itself to receive client feedback on their help system, they feel that the potential gains outweigh the risks. The help team has content that has been approved and is ready to send to production. On the day before that, the template team plans to have their subsystem ready.
Project completion skew occurs as the team grows to a point that individuals are doing different things and multiple groups each have a different project focus. In other words, each project is coping with contributors on a project who are working on diverse activities, and each project alone has a need to develop, integrate, test, and review their work before their project can be integrated into the live web site. Inevitably, projects run concurrently, and they don’t all finish at the same time. One project might be getting underway while another project prepares to wrap up its work. There needs to be a way to keep the work separate.
This problem becomes especially acute for Rita’s group when James and Joe lead separate projects. The changes from one project interfere with the other. Because they haven’t introduced separate edit areas, some of the unfinished changes for the branding project mix in with the finished changes for the new help system. The mixture of finished and unfinished work proves problematic.
Joe works with the marketing department to build a weekly news section of the web site. For the first few weeks, Joe moves the weekly updates to the production site himself every Saturday at midnight. When that becomes tiresome, he and James agree to share duties on alternate weeks. It gets worse when the marketing department decides to push changes more frequently, now three times per week: a Monday edition, a midweek edition, and a weekend edition of the newsletter. Both of them agree that they need a more automated system. One day over lunch, they wish for a system that will deploy precisely the assets that they specify, at a predetermined time, and that will notify them by pager if it encounters errors.
Deployment comprises the processes and practices by which web assets that have been reviewed and approved are copied from a development environment to a production environment. The goal of a deployment infrastructure is to copy assets to the production server into the right location at the appropriate time. Assets no longer on the development side are deleted from the production side.
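The copy-and-delete behavior described above can be sketched as a small planning step. The following Python example is an illustration under stated assumptions, not a description of any specific deployment tool. It assumes that each side is summarized as a mapping from asset path to a content digest, and the names `DeployPlan` and `plan_deployment` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DeployPlan:
    copy: list[str] = field(default_factory=list)    # assets to copy to production
    delete: list[str] = field(default_factory=list)  # assets to remove from production


def plan_deployment(approved: dict[str, str], production: dict[str, str]) -> DeployPlan:
    """Compute what a deployment must do to make production match the approved set.

    Both arguments map an asset path to a digest of its content.
    """
    plan = DeployPlan()
    for path, digest in sorted(approved.items()):
        # Copy assets that are new or whose content differs from production.
        if production.get(path) != digest:
            plan.copy.append(path)
    # Assets no longer on the development side are deleted from production.
    plan.delete = sorted(production.keys() - approved.keys())
    return plan
```

Computing the plan separately from executing it leaves room for the controls that a deployment needs, such as scheduling the job for a set time, checking who is authorized to start it, and sending a notification if a step fails.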
An important organizational underpinning of a deployment infrastructure is the "release agreement," which binds the development and production groups into a social contract. Content and application developers agree to approve and formally submit any asset to be deployed, and production server administrators agree to use only released assets on a production server. In a well-designed deployment infrastructure, only someone who is authorized to initiate a deployment job does so. A well-designed deployment infrastructure copies assets into production with minimal or no effort, with full control, notification, and the ability to roll back to a known-good version.
Rita’s gang needs an efficient and reliable deployment mechanism almost from the very beginning. Although moving a handful of files to production is simple when taken in isolation, the small overworked staff soon finds itself buried in small trivial tasks that nonetheless are prone to error. It becomes especially difficult when the person copying the files isn’t the one who made the changes and, therefore, isn’t familiar with the files.
With multiple editions of the newsletter per week, and the increased visitor volume, a misstep in the handling of the numerous content sources is bound to happen. Sure enough, it happens at a particularly inopportune time. The company brass has quietly begun investigating the feasibility of selling the web operation to outside buyers or, alternatively, of conducting a public offering. Therefore, it is especially embarrassing that the lead article on a Monday edition of the newsletter misspells the company name of a new partner company. Worse still, it incorrectly identifies the job titles of three of that company’s executives.
Workflow is the process by which people collaborate to develop assets within a content-management system. This issue becomes important when several people collaborate on a job where wait time is a significant proportion of the total job time and where patterns of interaction are repeated frequently. Workflow improves productivity by minimizing the wait time between successive steps, and it automates the business logic of an organization.
One important benefit of workflow is its ability to automate routing, review, and approval of jobs. A second benefit is the ability to enforce a formal business process, as the web operation becomes larger. This advantage becomes especially important when the operation spans different departments.
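One way to picture automated routing, review, and approval is as a small state machine. The Python sketch below is a simplified illustration. The states, the action names, and the rule that authors may not approve their own work are assumptions made for the example, not features of a particular workflow product.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"


# Allowed transitions: (current state, action) -> next state.
TRANSITIONS = {
    (State.DRAFT, "submit"): State.IN_REVIEW,
    (State.IN_REVIEW, "approve"): State.APPROVED,
    (State.IN_REVIEW, "reject"): State.DRAFT,  # rejected work returns to its author
}


class Job:
    """A unit of work, such as one newsletter edition, moving through review."""

    def __init__(self, title: str, author: str) -> None:
        self.title = title
        self.author = author
        self.state = State.DRAFT
        self.history: list[tuple[str, str]] = []  # (action, actor) audit trail

    def apply(self, action: str, actor: str) -> State:
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action!r} a job in state {self.state.value}")
        # Assumed business rule: a second person must approve the work.
        if action == "approve" and actor == self.author:
            raise PermissionError("authors cannot approve their own work")
        self.state = TRANSITIONS[key]
        self.history.append((action, actor))
        return self.state
```

Encoding the transitions as data means that a skipped review is rejected by the system rather than left to memory, and the recorded history shows who reviewed each set of changes.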
Rita’s group introduces a formal workflow system after the misspelling fiasco in the online newsletter feature article. The rapid growth in the staff justifies investing in a workflow solution because it ensures that each set of changes has had proper review and approval.
Go Dot-com
Rita’s department spins out as an independent corporation in April 1999. The marketers adopt a dot-com name, ezSuggestionBox.com. Previously they were a division of a small manufacturing company building educational aids for K–12 and the higher education markets. Now they are an independent 100-plus person company. Their customer base is growing, and they have seven major initiatives moving at web speed simultaneously. The industry analysts refer to them as an "application service provider" in the untapped higher-education market segment. Inside the company, they realize that their growth and ultimate survival as a standalone company rest on their ability to continually enhance and extend their service offering, while maintaining their 24x7 uptime promise.
Some of the company’s initiatives immediately change the current web site. For example, without advance notice, the marketing group decides to insert a series of banners on the homepage calling attention to an improved notification service. Other initiatives move at a more deliberate pace, such as the ongoing effort to completely replace the homegrown script-based notification system with a commercial product from a third-party vendor.
Each banner project goes from conception to assignment, to implementation, to review, and to approval within four hours, through the rollout of the new service. But the integration of the commercial notification engine takes six weeks. Neither change can afford to wait for the other.
The issue of long-term versus short-term projects becomes important when there’s a long-term web development effort going on concurrently with short-term changes to a web site. The essential point is that there are changes for the long-term branch that overlap with changes for the short term, and the changes cannot go to production together. For this reason, the development efforts are split apart and done separately.
Rita’s gang encounters the challenge of managing what essentially amounts to two distinctly different web properties. On one hand, there is an ongoing sequence of short-term changes to insert a new banner on the homepage, for which each takes only a few hours to go from implementation to approval. On the other hand, the deeper and more pervasive changes to integrate the third-party notification system take weeks to complete. The solution to this problem involves setting up separate "branches" of development, where each branch corresponds to a logically independent web site.
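The overlap between short-term and long-term work can be made visible with a simple comparison against a common starting point. The Python sketch below assumes that each branch is summarized as a mapping from asset path to a version label. The function names and the sample labels in the usage note are hypothetical.

```python
def changed_paths(base: dict[str, str], branch: dict[str, str]) -> set[str]:
    """Paths added, removed, or modified on a branch relative to the base."""
    return {p for p in base.keys() | branch.keys() if base.get(p) != branch.get(p)}


def overlapping_changes(
    base: dict[str, str],
    short_term: dict[str, str],
    long_term: dict[str, str],
) -> list[str]:
    """Assets touched by both branches since they diverged from the base."""
    return sorted(changed_paths(base, short_term) & changed_paths(base, long_term))
```

For example, if a banner change and the notification-engine integration both edit index.html, `overlapping_changes` reports that file. Such files need care when the long-term branch is eventually brought together with the short-term changes that have already gone to production.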
As the scope of the web operation expands, corporate marketing plays a larger role in directly creating content or sponsoring the creation of content by outside contributors, known as stringers. Nina, the corporate marketing manager, is an example of an inside contributor. When she assembles a press release, she focuses on the market positioning spin, the strategic sprinkling of third-party endorsements contained within the press release, and the go-live date of the release. At the same time, she defers to the art director’s guidance on the layout of the press release detail pages on the corporate site. The same goes for the choice of font for the title on the press release summary page. Most of all, Nina appreciates that previously written press releases will be automatically reformatted to the current design rules.
Separating content from presentation is also known as templating. The assets that comprise a web site must be factored in a way that allows many members of the web team to make changes concurrently. A content contributor is someone whose domain of responsibility focuses on the information content within a web site. Another department or group typically holds responsibility for determining the form of the presentation of the content. Because a large organization finds it essential to separate content from presentation, it centralizes art direction decisions, while it decentralizes content creation. This separation becomes more pronounced as an organization grows.
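As a rough illustration of templating, the following Python sketch renders one piece of content through two different presentations using the standard library’s `string.Template`. The template markup, the field names, and the sample values are invented for the example. A production system would typically use a fuller template engine.

```python
import html
from string import Template

# Presentation: owned by the art director and changed without touching content.
DETAIL_PAGE = Template(
    '<article><h1>$title</h1><p class="dateline">$date</p><p>$body</p></article>'
)
SUMMARY_ITEM = Template('<li><a href="$url">$title</a> ($date)</li>')


def render(template: Template, content: dict[str, str]) -> str:
    """Fill a template with content, escaping values so they are safe in HTML."""
    escaped = {key: html.escape(value) for key, value in content.items()}
    # substitute() ignores unused keys and raises KeyError if a field is missing.
    return template.substitute(escaped)


# Content: owned by the contributor, with no layout decisions embedded in it.
press_release = {
    "title": "Q&A with the product team",
    "date": "June 2000",
    "body": "Sample body text.",
    "url": "/press/sample.html",
}

detail_html = render(DETAIL_PAGE, press_release)
summary_html = render(SUMMARY_ITEM, press_release)
```

When the design rules change, only the templates are edited. Rendering the stored content again reformats every earlier press release to the current design.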
This is exactly what we see in this story. Nina, the corporate marketing manager, focuses on creating content. Divorcing the content from presentation lets her reuse content in many different situations, some of which were not anticipated when she created the content. For example, another manager uses a one-sentence summary of her piece as a caption on a promotional graphic. For Rita, the key is to design a content-capture framework that maximizes the opportunities for the content to be reused over its lifetime.
By the middle of 2000, ezSuggestionBox.com expands its operations worldwide. Two regional development centers operate out of Boston and Chicago. To satisfy the customer traffic, production web farms operate in San Jose, Boston, Chicago, and London. The total asset base has grown to 500,000 web assets. Multiple content servers, geographically dispersed, host the assets.
The issue of handling very large-scale web sites arises when web operations spread over development locations around the world, or when the volume of web assets exceeds the capacity of a single server. Scalability of the software and hardware infrastructure is an essential consideration, regardless of the particular choice of technology.
In the case of ezSuggestionBox.com, the developers find the need to distribute assets among several content-management servers when their operations expand into regional development centers. They benefit in two ways: each development center uses a common infrastructure for content management, and the basic framework extends to handle future growth in assets.