Lessons Learned
Accuracy of data. One of the first lessons learned was the importance of keeping frequently changing information in the plans accurate. Much of the recovery plans' value in this instance came from their up-to-date information, especially telephone numbers, names of current team leads, and locations of equipment. The rigor of updating this information every three months certainly paid off in this case.
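A quarterly review of this kind can be supported by a simple staleness check. The sketch below is only an illustration, not the company's actual tooling: the entry names, dates, and the `stale_entries` helper are hypothetical, and "three months" is approximated as 90 days.

```python
from datetime import date, timedelta

# Assumption: the three-month update cycle is approximated as 90 days.
REVIEW_INTERVAL = timedelta(days=90)

def stale_entries(entries, today):
    """Return the names of plan entries last verified more than 90 days ago."""
    return [
        name
        for name, last_verified in entries.items()
        if today - last_verified > REVIEW_INTERVAL
    ]

# Hypothetical plan entries with the date each was last verified.
entries = {
    "team lead phone list": date(2024, 1, 10),
    "equipment locations": date(2024, 5, 2),
}

print(stale_entries(entries, today=date(2024, 6, 1)))
# prints ['team lead phone list']
```

Any entry the check flags would be re-verified before the next exercise or event.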
Validity of plans. The company's business and technical recovery plans are validated with exercises every six months. These exercises, each conducted as a simulated event, made coordination smoother during the actual event and reinforced the value of conducting such tests.
Clarification of roles. The business and technical recovery plans specified the roles and responsibilities each group was to take. The actual event validated the roles that had been prescribed, which helped to reinforce the need to keep these roles up to date.
Paralleling of actions. Whenever possible, the recovery teams performed multiple actions in parallel, a strategy that significantly reduces overall recovery time. For example, desktop operations, telecommunications, and facilities were all able to carry out their actions simultaneously. This technique works best when the efforts are well coordinated and well communicated.
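The time saved by working in parallel comes down to simple arithmetic. The sketch below uses hypothetical task durations (the account gives none): performed one after another, the tasks take the sum of their durations; performed simultaneously by independent teams, they take only as long as the slowest task.

```python
# Hypothetical durations in hours; the figures are illustrative only.
task_hours = {
    "desktop operations": 3.0,
    "telecommunications": 2.0,
    "facilities": 4.0,
}

def sequential_hours(tasks):
    """Elapsed time if each team waits for the previous one to finish."""
    return sum(tasks.values())

def parallel_hours(tasks):
    """Elapsed time if independent teams all start at once."""
    return max(tasks.values())

print(sequential_hours(task_hours))  # 9.0
print(parallel_hours(task_hours))    # 4.0
```

The parallel figure holds only when the tasks do not depend on one another, which is why coordination and communication matter.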
Effective use of technical communications. Cell phones were used extensively during this recovery, but in retrospect few of the recovery team members thought to use the two-way instantaneous communication features of their phones. Doing so would have saved valuable time during the early stages of the emergency and enabled some key employees to respond more quickly. This feature will be emphasized in future simulation exercises and in recoveries from actual events.
Centralization of recovery efforts. All recovery planning was conducted in a single, centralized location referred to as the "war room." Working together in a central area facilitated coordinated recovery efforts. It was agreed that an important aspect of this centralization was publicizing the location of the room to those who needed to be there. Admittance had to be controlled, however, because once word of the event became widespread, many people naturally sought entry to the room to learn the latest details. This led to the next lesson learned.
Importance of human communication. The help desk kept people informed about the event and the progress of the recovery. Sharing this information reduced the number of unnecessary entrants into the war room, and the practice has since been made formal policy for future events.
Timeliness. Conduct a "lessons learned" session as soon as possible after an event, to capture facts, missteps, and suggestions for improvements as accurately and as completely as possible. In this case, the "lessons learned" session was effectively completed within three working days of the event.
Marketing. Publicize the winning aspects of recovery activities. Nothing breeds success and support like success and support. An article about the business recovery was widely distributed in the company newsletter.
Members of this company's IT business continuity organization participate in several national organizations to understand and execute current best practices in business continuity. Their dedication paid off on this July workday. And investors who watched the team's well-coordinated response would certainly agree.