Approaches to Content Management
Managing Web Assets
Web development groups change rapidly, especially the successful ones. They hire more staff. Teams tackle business objectives that expand and transform over time. The pace is relentless. Above all, the size of the operation tends to grow quickly. (See Figure 1.) Sometimes this growth follows the business: the operation grows as fast as people can be hired, or it expands as it validates its business model.
Figure 1 Web operations tend to grow rapidly.
As the size of the web operation increases, different techniques for managing the web property come into play. As we'll see, each technique overcomes important problems faced during development. Each technique has advantages and limitations.
In the following sections, we'll introduce four approaches to managing assets for a web property. The first one is suitable for a small web site consisting of fewer than a hundred assets. The others make sense for successively larger web operations, up through enterprise-class web sites with millions of assets.
Live Editing
For a very small web site produced by one or two developers, where the site consists of fewer than 100 files, it is common for the developers to edit the live web site directly. (See Figure 2.) The approach is simple. There is one copy of the web site, the live production copy. To make a change, edit the asset directly.
Because small web sites are typically exploratory efforts, uptime isn't essential. The number of hits on the web site is small, so the fact that files may be temporarily missing or incorrect on the production web site doesn't affect many visitors. In Rita's story, live editing is adequate in the early days, during her "2 A.M. software" era, when her primary goal is to establish a proof of concept.
Figure 2 Developers on small sites often edit the live web site directly.
This arrangement has the advantage of being simple to administer. Fix what needs to be fixed. Visit the live site to see if the site works. This simple scheme works because the entire site is under one person's management, and the live site is the latest working copy of the site.
The major disadvantage of this arrangement is that there is little control of the site. Typically, the only version control consists of occasionally making a copy of the entire web site. If a problem is discovered and there's a need to revert some of the site, or the entire site, to a previous working version, the copies are examined to determine which one to roll back to.
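The copy-based scheme described above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the function names snapshot and restore_file, the timestamp label format, and the directory layout are hypothetical and do not come from the text.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def snapshot(site_root: Path, backup_root: Path, label: Optional[str] = None) -> Path:
    """Copy the entire site into a new, labeled directory under backup_root."""
    # Default label is a timestamp so that successive copies sort by age.
    label = label or datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / label
    shutil.copytree(site_root, target)  # fails if this label already exists
    return target


def restore_file(snapshot_dir: Path, site_root: Path, asset: str) -> None:
    """Roll one asset back to the version stored in the chosen snapshot."""
    shutil.copy2(snapshot_dir / asset, site_root / asset)
```

Note what the sketch does not provide: nothing records who changed what or why, and choosing the right copy to roll back to still means inspecting the copies by hand. That is the lack of control the paragraph above describes.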
Directly making changes on the production site has other unfortunate consequences. First, the solution doesn't scale along several dimensions. Only a handful of people can work on a live production site, because any mistake becomes visible to the user base immediately. This makes it impossible to accomplish objectives more rapidly by boosting staffing. In addition, because the site must be fully functioning at all times, the kinds of projects that can be undertaken are severely limited. Any change that requires multiple files to change in a coordinated fashion, or that requires any testing at all, cannot be undertaken on a live site.
Finally, and perhaps most insidious, live editing promotes a self-limiting work culture: by its nature, it encourages a "web cowboy" mentality that hinders future development. One casualty is the ability to host the web property on a different web server without rewriting its internal references. For example, if the web property of a hypothetical firm, the General Bot Corporation, uses fully qualified references, such as href=http://www.generalbot.com/privacypolicy.html, then this limits the ability to test the web assets before moving them into production. Suppose, for instance, that the company evaluates competing web-hosting vendors to serve the corporate web site more economically and reliably. This requires copying a snapshot of the web property to a test server to measure how different server configurations handle simulated loads. In our example, we want the test web server itself to handle the reference to the privacy policy, instead of unconditionally sending the viewer to the main corporate site. Internal references should therefore be written relative to the "docroot", as in href=/privacypolicy.html. This more closely expresses the true intent of the reference. The practice of live editing tends to mask the distinction between internal and external references, which impedes future expansion of the development team and the testing effort.
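To make the distinction concrete, here is a minimal sketch of how fully qualified internal references might be converted to docroot-relative ones. The host name comes from the General Bot example; the function name make_docroot_relative and the regular-expression approach are assumptions made for illustration. A real site would more likely use a proper HTML parser.

```python
import re

# Host names that count as "internal" (taken from the example in the text).
SITE_HOSTS = ("www.generalbot.com",)

# Matches href=http://host/path with or without a quote after the equals sign.
_HREF = re.compile(
    r'href=(?P<quote>["\']?)https?://'
    r'(?P<host>[^/"\'\s>]+)(?P<path>/[^"\'\s>]*)?',
    re.IGNORECASE,
)


def make_docroot_relative(html: str, hosts=SITE_HOSTS) -> str:
    """Rewrite fully qualified references to our own hosts as docroot-relative."""

    def _swap(match):
        if match.group("host").lower() not in hosts:
            return match.group(0)  # external reference: leave it untouched
        path = match.group("path") or "/"  # a bare host means the site root
        return "href=" + match.group("quote") + path

    return _HREF.sub(_swap, html)
```

Applied to a page that links to http://www.generalbot.com/privacypolicy.html, the sketch rewrites the reference to /privacypolicy.html and leaves links to other hosts alone. A copy of the site on a test server then serves its own privacy policy.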
Staging Web Site
As the number of assets grows and the number of developers increases, it is no longer practical to edit the production web server directly. Enter a separate web server, or staging server. It runs a copy of the production web site, with the difference that changes the developers intend to put into production are copied to the staging web server first. (See Figure 3.) This solution works on sites of up to 1,000 assets, when there are fewer than five developers.
The staging web site solution is adequate when Rita's gang consists of a handful of developers (see Article 2). We see that as the efforts of the developers begin to take divergent paths, managing the changes and the testing within a single staging web site has significant limitations.
Figure 3 Copying a modified asset to a staging web site allows a development team to test changes before deploying them to the production web site.
The staging server introduces the important ability to test changes before they go live. Developers are able to detect errors before they reach the production site. This solution is similar to the live edit procedure, except that developers point their browsers at the staging server instead of at the production server. As long as there are fewer than five developers or so, the improvement in the ability to test outweighs the additional burden of the two-step procedure. First, move changes to the staging server. Second, copy the changes to the production server.
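The two-step procedure can be sketched as a single copy operation applied twice. The function name promote and the idea of treating each server's docroot as a local directory are assumptions made for illustration. In practice the copy might go over FTP or a similar transfer mechanism.

```python
import shutil
from pathlib import Path


def promote(asset: str, source_root: Path, target_root: Path) -> Path:
    """Copy one asset between site roots, keeping its docroot-relative path."""
    destination = target_root / asset
    # Create intermediate directories so that nested assets can be promoted.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source_root / asset, destination)
    return destination
```

A developer would first call promote(asset, work_area, staging_root), check the result in a browser pointed at the staging server, and only then call promote(asset, staging_root, production_root).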
This solution begins to break down as the number of developers increases beyond a handful. With more developers, it becomes harder to keep track of individual changes. That trap befell Rita's organization in the second article. Although Rita's development organization deserves credit for having progressed to a level of maturity where they recognize the importance of testing web content before going live, they encounter the principal limitation of the staging server model. The development team has too many members for each person to understand which changes belong to whom and which changes are or are not ready to go into production.
Independent Edit Areas
At the next level of sophistication, development groups retain the staging server but give each developer an independent area in which to make and test changes. (See Figure 4.) This partially solves the problem of one developer's changes stepping on another's, because each developer has a web server and a separate area in which to make changes. It has the additional benefit that each developer is able to test changes independently. This approach has some success with development teams of up to eight people and asset counts below roughly 2,000 to 5,000.
Rita's band of renegades finds the independent edit areas approach useful when branching out into small project teams working on different assets. Keeping an accurate version history is important, however, and it is wise to use this technique in conjunction with source code versioning tools.
Figure 4 Developers on larger sites use separate edit areas to make and test changes.
There are two drawbacks to this approach. First, as the number of independent areas increases to accommodate the developers, the total space consumed in the file system for each copy of the assets increases. Second, conflicts between areas become harder to keep track of. This happens because, with many developers, the likelihood of two people changing the same file increases. Let's suppose that two developers need to change index.html. One of them completes his change and copies the change into the staging server. Assistance from a version control or content-management tool is required to make a second developer aware that the changes to index.html are now in conflict with the latest version of index.html. Without this kind of assistance, it is very likely that the second developer will overwrite the index.html in the staging server with his modified version of index.html. This effectively overwrites the changes from the first developer.
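The kind of assistance described above can be sketched as follows. The idea, assumed here for illustration, is to remember a digest of the staging copy at the moment a developer takes a file into an edit area, and to refuse the copy back to staging if the staging copy has changed in the meantime. The function names check_out and submit are hypothetical, and real version control and content-management tools do considerably more, such as merging.

```python
import hashlib
import shutil
from pathlib import Path


def digest(path: Path) -> str:
    """Return a fingerprint of the file's current contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_out(staging: Path, edit_area: Path, name: str) -> str:
    """Copy an asset into an edit area and return the digest it was based on."""
    shutil.copy2(staging / name, edit_area / name)
    return digest(staging / name)


def submit(staging: Path, edit_area: Path, name: str, base_digest: str) -> bool:
    """Copy the edited asset to staging unless staging changed since check-out."""
    if digest(staging / name) != base_digest:
        return False  # conflict: someone else's change reached staging first
    shutil.copy2(edit_area / name, staging / name)
    return True
```

In the index.html scenario, the first developer's submit succeeds. The second developer's submit returns False, because the staging copy no longer matches the version that the second edit was based on, so the first developer's changes are not silently overwritten.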
Content Management
When the asset count exceeds 2,000 to 5,000 files or the web team exceeds 10–12 content developers and code developers, then it usually becomes necessary to adopt a content-management tool. Formal support from a content-management system overcomes the drawbacks of informal solutions, such as the ones described earlier. Content management is a discipline that manages the timely, accurate, collaborative, iterative, and reproducible development of a web property. (See Figure 5.) It combines a mechanism to store a collection of web assets with processes that seamlessly mesh the activities of people and machines within an organization. Content management responds to the unique combination of problems posed by web development.
Rita's gang should reasonably expect to support their activities with a content-management approach by the time their group size reaches 12. If they allow time to evaluate, purchase, and implement such a system, and if they consider their rapid growth, they could reasonably start the process when their team is 8–10 people, depending on their hiring rate.
Figure 5 Content management orchestrates the development, testing, review, and deployment of web assets.
Because web efforts tend to expand quickly, both in number of assets and in size of staff, it often makes sense to introduce formal content management well in advance of crossing the asset and team size thresholds suggested earlier. As a rule of thumb, you should begin introducing content management about six months before your effort is expected to cross those thresholds. This gives time to evaluate tools, solicit budgetary approval, complete the purchase, implement the tool set, and train your staff before the need becomes so critical that the absence of a content-management solution hurts your business. In addition, the overall training cost is lower if you introduce content-management techniques while your staff is smaller and fewer people have become entrenched in bad habits.