The 13 Cardinal Steps for Implementing a Business Continuity Program, Part Three
This is the third installment of a four-part series that identifies and discusses the 13 cardinal steps (see Figure 1) needed to initiate and maintain a business continuity program. Part One covered the first four steps, and Part Two covered steps five through eight. In this installment I explain steps nine through twelve. Steps nine and ten involve the development of business continuity recovery plans oriented toward business users and technical users, respectively. Steps eleven and twelve describe how to conduct validation and simulation tests. Step thirteen is the topic of Part Four of this series and explores operational tests, the most comprehensive and complex of the three testing exercises.
Figure 1 The 13 Cardinal Steps of a Business Continuity Program
Step 9: Develop Recovery Plans Oriented To Business Users
Up to this point I have shown you how to identify the critical business processes that need to be restored in the event of a disaster, and how to develop the high-level and detailed recovery strategies needed to enact such a restoration. We next need to develop the actual business continuity plans that will be used by business users to recover their critical processes.
I have seen a variety of methods used to develop such plans. Some shops keep it very simple and use nothing more than Word documents to prescribe their recovery steps. At the other end of the spectrum are those who use sophisticated, and expensive, tools designed specifically for this purpose. Many of my clients use a SQL relational database product from Strohl Software called the Living Disaster Recovery Planning System (LDRPS). It is very comprehensive and ideal for large shops with hundreds of plans to maintain. Many financial organizations use LDRPS because they need to centralize and standardize plans for hundreds of branch offices.
The disaster recovery service provider Sungard also provides a tool, slightly less sophisticated than LDRPS, for developing plans. IBM and HP also supply business continuity plan development tools. Regardless of the tool selected, I believe there are six important attributes that characterize an effective business continuity plan:
- Understandable – use simple wording that the reader will comprehend
- Comprehensive – include all critical business processes and their dependencies
- Accurate – ensure currency of phone numbers, personnel, software, hardware
- Accessible – make the plans easy to reach during an emergency; consider keeping copies on laptops, on thumb drives, or as hard copies at home
- Maintainable – develop plans that are easy to update and distribute
- Organized – organize the plan in a logical manner that follows the actual sequence of recovery
As to the organization of the plan, it usually follows a pattern of four main sections, each with subgroups:
Response
- call trees
- internal contacts
Resources
- recovery teams
- suppliers
- customers
- software
- hardware
Recovery
- relocation procedures
- business processes and dependencies
- special supplies and telecommunications
Resumption
- reverting back to permanent site
- analysis of impact of the event
- documentation of unique information
Business recovery plans will vary in size, complexity and scope depending on the type of environment they pertain to, but all will have these essential parts included in them.
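For shops that keep their plans in a structured form rather than in free-form documents, the four-section layout above can be sketched as a simple data structure with a completeness check. The Python below is only an illustration under that assumption. The names REQUIRED_SECTIONS and missing_parts are hypothetical and do not come from LDRPS or any other planning tool.

```python
# Hypothetical sketch: the four main sections of a business recovery plan
# and their subgroups, as listed above.
REQUIRED_SECTIONS = {
    "Response": ["call trees", "internal contacts"],
    "Resources": ["recovery teams", "suppliers", "customers", "software", "hardware"],
    "Recovery": [
        "relocation procedures",
        "business processes and dependencies",
        "special supplies and telecommunications",
    ],
    "Resumption": [
        "reverting back to permanent site",
        "analysis of impact of the event",
        "documentation of unique information",
    ],
}


def missing_parts(plan):
    """Return (section, subgroup) pairs that the plan lacks or leaves empty.

    `plan` maps a section name to a dict of subgroup name -> content.
    """
    gaps = []
    for section, subgroups in REQUIRED_SECTIONS.items():
        present = plan.get(section, {})
        for subgroup in subgroups:
            # A missing key and an empty value are both treated as a gap.
            if not present.get(subgroup):
                gaps.append((section, subgroup))
    return gaps
```

A plan owner could run a check like this before each review cycle to confirm that no essential part has been left out of a given plan.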
Step 10: Develop Recovery Plans Oriented To Technical Users
Business continuity recovery plans oriented to technical users are very similar to those oriented to business users with one important exception: technical plans include steps to recover the IT infrastructure. Most business processes today depend heavily on software applications, databases, and network connections. These are the essential components of an IT infrastructure, and must be recovered in the event of a disaster in order to restore the business processes they support.
Some shops still refer to these IT business continuity plans as disaster recovery plans. If the components being restored are purely technical, that term is accurate. Normally, however, business processes are associated with the IT environment, and for this reason the element of business continuity becomes a part of these plans as well.
Step 11: Conduct Validation Tests
There are primarily three types of testing, or exercises, used with business continuity plans:
- Validation tests (conducted approximately every 3-6 months)
- Simulation tests (conducted approximately every 6-12 months)
- Operational tests (conducted approximately every 12-18 months)
This section describes validation tests, the next section describes simulation tests, and Part Four of this series covers operational tests. A validation test verifies the accuracy of the data within the plan. The specific data checked includes:
- employees' office telephone numbers
- employees' mobile telephone numbers
- employees' home telephone numbers
- customers' contact information
- suppliers' contact information
- identification of all critical business processes
- current recovery time objectives (RTOs) of all processes
- current recovery point objectives (RPOs) of all processes
- all dependencies of all critical business processes
- identification of all currently needed software
- current version, release and patch levels of software
- identification of all currently needed hardware
- current model numbers of all needed hardware
Planners usually organize telephone numbers into call trees in which a higher-level person, such as a manager or a lead, calls several subordinates who in turn may call other members of the team. In this way planners can contact the maximum number of individuals in the minimum amount of time. Organizers conduct call tree tests by having each person who is assigned numbers actually call those individuals, usually during off hours, and by tracking whether the people called and the numbers used are still accurate.
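The call tree idea can be illustrated with a short Python sketch. It is a simplified model, not a description of any particular tool: the tree contents and the function names call_rounds and failed_contacts are hypothetical. It also assumes that everyone at one level of the tree is contacted in the same "round", ignoring the time a caller needs to work through several numbers.

```python
# Hypothetical call tree: each person maps to the people they are assigned to call.
CALL_TREE = {
    "manager": ["lead_a", "lead_b"],
    "lead_a": ["analyst_1", "analyst_2"],
    "lead_b": ["analyst_3"],
}


def call_rounds(tree, root):
    """Group people by how many calls away they are from the root caller."""
    rounds = []
    seen = {root}
    current = [root]
    while current:
        next_round = []
        for caller in current:
            for callee in tree.get(caller, []):
                if callee not in seen:  # guard against duplicate assignments
                    seen.add(callee)
                    next_round.append(callee)
        if next_round:
            rounds.append(next_round)
        current = next_round
    return rounds


def failed_contacts(tree, reached):
    """List (caller, callee) pairs where a call tree test did not reach the callee."""
    return [
        (caller, callee)
        for caller, callees in tree.items()
        for callee in callees
        if callee not in reached
    ]
```

With this sample tree, two rounds of calls reach all five team members. After a test, passing the set of people who were actually reached to failed_contacts yields the assignments whose numbers need to be rechecked.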
Plan owners normally contact business users to verify that business processes and their dependencies are still valid. Similarly, planners will contact appropriate IT personnel and suppliers to ensure that software versions and hardware model numbers remain current.
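One way to keep track of which plan data is due for re-verification is to record the date each item was last confirmed. The sketch below is an assumption-laden illustration: the six-month threshold is taken from the upper end of the validation interval mentioned earlier, and the names MAX_AGE and stale_items are hypothetical.

```python
from datetime import date, timedelta

# Assumption: an item is stale if it has not been verified within roughly
# six months, the longest validation interval mentioned above.
MAX_AGE = timedelta(days=183)


def stale_items(items, today):
    """Return the names of plan items whose last verification is older than MAX_AGE.

    `items` maps an item name to the date it was last verified, or None if
    it has never been verified.
    """
    stale = []
    for name, last_verified in items.items():
        if last_verified is None or today - last_verified > MAX_AGE:
            stale.append(name)
    return sorted(stale)
```

The resulting list gives the plan owner a starting point for the next round of calls to business users, IT personnel, and suppliers.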
Step 12: Conduct Simulation Tests
A simulation test is often referred to as a Table Top Exercise because it is usually conducted with all key participants of the recovery sitting around a table (or teleconferencing in) and going through the business continuity plan step by step to assess the validity and viability of the plan. A previous segment of this Management Guide offers a detailed discussion of this topic in a four-part series under the heading of 'Conducting an Effective Table Top Exercise' in the Business Continuity Section.
This third part covered the development of business continuity plans for both business and technical users, and two of the three types of tests: validation and simulation. Part Four is the final installment of this series on implementing a business continuity program. It explains operational testing, in which business processes and their supporting software applications are functionally restored and tested by business users at recovery sites.