- Management Reference Guide
This is the final installment of the four-part series that identifies and discusses the 13 cardinal steps, shown in Figure 1, needed to initiate and maintain a business continuity program. In Parts One, Two and Three I covered the first twelve of these steps. In this fourth part I discuss what many believe is the culmination of a successful business continuity program: conducting an operational test, or exercise. An operational exercise demonstrates clearly how well critical business processes can be restored, how long it takes to recover them, how much data is lost in the process, and to what degree business continuity plans are valid and viable.
Figure 1 The 13 Cardinal Steps of a Business Continuity Program
Step 13: Conduct an Operational Exercise
Conducting an operational exercise is one of the most important activities in managing an effective business continuity program. Such an exercise confirms the recoverability of critical business processes, validates the accuracy and thoroughness of plans, and quantifies the amount of time required and the amount of data potentially lost in recovering from a disastrous event. The term 'exercise' is often used in preference to 'test' to emphasize that the activity is not designed to be a pass-or-fail test. The recovery exercise is intended for planners to learn about, build upon, and make improvements to their overall recovery strategies.
An operational exercise has 12 key elements, shown in Figure 2.
Figure 2 The 12 Key Parts of an Operational Exercise
- Executive Sponsorship – There needs to be one or more executive sponsors who will engage in a number of activities. These include supporting the exercise, providing the necessary resources (human and otherwise) for planning and executing it, offering direction and clarity, resolving competing priorities and other conflicts, finalizing the scope of the exercise, assigning key roles and generally keeping the project moving forward. Executive sponsors usually come from IT, from Risk Management (or whichever function business continuity reports to), and from the business unit whose critical processes are being recovered in the exercise.
- Objectives – Executive sponsors and exercise planners should identify and reach consensus on the specific objectives of the exercise. Objectives should include recovering specific business processes and applications within expected timeframes, normally referred to as recovery time objectives (RTOs).
- Scope – Scope describes which business processes and software applications will be included in the exercise, and which will be out of scope. In my experience, these two lists change frequently during the first few weeks of planning as the needs and urgencies of the business community become better known.
- Assumptions – Assumptions help to clarify which parts of the infrastructure will be treated as up or down during the exercise. These include such items as data network segments and voice networks. Other common assumptions cover whether testing will be done from home or at work, and that the production environment will not be impacted at any time during the exercise.
- Participants from Technical Units – The list of participants from the technical units should include system administrators, systems engineers, database administrators, application support, network engineers, appropriate suppliers and other technical support personnel.
- Participants from Business Units – The list of participants from the business units should include all testers, observers, and optionally, executive sponsors.
- Action Items – Throughout the planning and preparation process, numerous action items will come up. These should all be identified, assigned, scheduled (meaning given realistic, committed completion dates) and tracked.
- Technical Recovery Plan – The technical recovery plan is one of the most important documents produced during the planning of an operational exercise. This document prescribes the exact sequence of tasks needed to recover systems, databases, network segments and other infrastructure components. The document should also contain task dependencies (both predecessor and successor), estimated start and end times and the resulting durations. Durations are essential for estimating the total expected recovery time for each business process. Figure 3 shows a recovery plan template I have used several times to track these measurements.
- Attendance Roster – Not all exercise coordinators keep track of attendance at planning meetings, but I find the practice helpful. I usually construct a color-coded, alphabetically sorted matrix with green indicating attendance or call-ins (we use teleconferencing a lot), yellow meaning that someone represented a person who could not attend, and red indicating absence. These color-coded charts help to spot trends and seem to encourage greater attendance.
- Observations/Issues – All of the diligent planning should come together on the day of the exercise, and anything noteworthy observed during the actual exercise should be recorded. Major issues may need to be escalated and tracked in detail, and the final status of each objective should be noted here.
- Lessons Learned – Within a few days after the exercise, its coordinator should conduct a lessons-learned session to identify what went well and where improvements could be made. All key participants, including business unit testers, should attend this session. The specific mechanics of conducting such a session can be found in Part 3 of "Conducting an Effective Table Top Exercise" under Business Continuity Management of this Guide.
Figure 3 Technical Recovery Plan Template
- Final Report – The last activity of an operational exercise is to document the entire process, results and lessons learned in a final report. There should be a one- or two-page executive summary at the beginning that encapsulates the major findings and recommendations of the exercise.
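To illustrate how the durations and dependencies in a technical recovery plan roll up into an expected recovery time, and how that figure can be checked against a recovery time objective, here is a minimal Python sketch. It is not the template from Figure 3. The task names, durations and the 300-minute RTO are hypothetical values chosen for illustration, and the `RecoveryTask` class and `schedule` function are names invented for this sketch. It assumes each task can start as soon as all of its predecessors have finished.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryTask:
    """One row of a hypothetical technical recovery plan."""
    name: str
    duration_min: int                      # estimated duration in minutes
    predecessors: list[str] = field(default_factory=list)


def schedule(tasks: list[RecoveryTask]) -> dict[str, tuple[int, int]]:
    """Return {task name: (start, end)} in minutes from the start of recovery.

    Assumption: a task starts as soon as all of its predecessors have ended.
    """
    by_name = {t.name: t for t in tasks}
    times: dict[str, tuple[int, int]] = {}
    in_progress: set[str] = set()

    def finish(name: str) -> int:
        if name in times:
            return times[name][1]
        if name in in_progress:
            raise ValueError(f"circular dependency involving {name!r}")
        in_progress.add(name)
        task = by_name[name]
        # Earliest start is the latest end time among the predecessors.
        start = max((finish(p) for p in task.predecessors), default=0)
        in_progress.discard(name)
        times[name] = (start, start + task.duration_min)
        return times[name][1]

    for t in tasks:
        finish(t.name)
    return times


# Illustrative figures only; not taken from any real exercise.
plan = [
    RecoveryTask("restore network segment", 60),
    RecoveryTask("rebuild database server", 90),
    RecoveryTask("restore database", 120, ["rebuild database server"]),
    RecoveryTask("start application", 30,
                 ["restore network segment", "restore database"]),
]

times = schedule(plan)
total_min = max(end for _, end in times.values())
rto_min = 300  # hypothetical recovery time objective for this business process

for name, (start, end) in times.items():
    print(f"{name:26} start={start:3d} end={end:3d}")
status = "met" if total_min <= rto_min else "missed"
print(f"expected recovery time: {total_min} min (RTO {rto_min} min, {status})")
```

With these sample figures the longest chain of dependent tasks (rebuild the database server, restore the database, start the application) sets the expected recovery time at 240 minutes. During the actual exercise, the same structure could hold the observed start and end times so that estimates and results can be compared in the final report.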
Summary
This concludes the four-part series on developing a successful business continuity program. The 13 steps discussed in these segments cover all of the major areas one needs to address to ensure the effective recovery of critical business processes within an enterprise.