- Management Reference Guide
- Table of Contents
- Introduction
- Strategic Management
- Establishing Goals, Objectives, and Strategies
- Aligning IT Goals with Corporate Business Goals
- Utilizing Effective Planning Techniques
- Developing Worthwhile Mission Statements
- Developing Worthwhile Vision Statements
- Instituting Practical Corporate Values
- Budgeting Considerations in an IT Environment
- Introduction to Conducting an Effective SWOT Analysis
- IT Governance and Disaster Recovery, Part One
- IT Governance and Disaster Recovery, Part Two
- Customer Management
- Identifying Key External Customers
- Identifying Key Internal Customers
- Negotiating with Customers and Suppliers—Part 1: An Introduction
- Negotiating With Customers and Suppliers—Part 2: Reaching Agreement
- Negotiating and Managing Realistic Customer Expectations
- Service Management
- Identifying Key Services for Business Users
- Service-Level Agreements That Really Work
- How IT Evolved into a Service Organization
- FAQs About Systems Management (SM)
- FAQs About Availability (AV)
- FAQs About Performance and Tuning (PT)
- FAQs About Service Desk (SD)
- FAQs About Change Management (CM)
- FAQs About Configuration Management (CF)
- FAQs About Capacity Planning (CP)
- FAQs About Network Management
- FAQs About Storage Management (SM)
- FAQs About Production Acceptance (PA)
- FAQs About Release Management (RM)
- FAQs About Disaster Recovery (DR)
- FAQs About Business Continuity (BC)
- FAQs About Security (SE)
- FAQs About Service Level Management (SL)
- FAQs About Financial Management (FN)
- FAQs About Problem Management (PM)
- FAQs About Facilities Management (FM)
- Process Management
- Developing Robust Processes
- Establishing Mutually Beneficial Process Metrics
- Change Management—Part 1
- Change Management—Part 2
- Change Management—Part 3
- Audit Reconnaissance: Releasing Resources Through the IT Audit
- Problem Management
- Problem Management–Part 2: Process Design
- Problem Management–Part 3: Process Implementation
- Business Continuity Emergency Communications Plan
- Capacity Planning – Part One: Why It is Seldom Done Well
- Capacity Planning – Part Two: Developing a Capacity Planning Process
- Capacity Planning — Part Three: Benefits and Helpful Tips
- Capacity Planning – Part Four: Hidden Upgrade Costs and
- Improving Business Process Management, Part 1
- Improving Business Process Management, Part 2
- 20 Major Elements of Facilities Management
- Major Physical Exposures Common to a Data Center
- Evaluating the Physical Environment
- Nightmare Incidents with Disaster Recovery Plans
- Developing a Robust Configuration Management Process
- Developing a Robust Configuration Management Process – Part Two
- Automating a Robust Infrastructure Process
- Improving High Availability — Part One: Definitions and Terms
- Improving High Availability — Part Two: Definitions and Terms
- Improving High Availability — Part Three: The Seven R's of High Availability
- Improving High Availability — Part Four: Assessing an Availability Process
- Methods for Brainstorming and Prioritizing Requirements
- Introduction to Disk Storage Management — Part One
- Storage Management—Part Two: Performance
- Storage Management—Part Three: Reliability
- Storage Management—Part Four: Recoverability
- Twelve Traits of World-Class Infrastructures — Part One
- Twelve Traits of World-Class Infrastructures — Part Two
- Meeting Today's Cooling Challenges of Data Centers
- Strategic Security, Part One: Assessment
- Strategic Security, Part Two: Development
- Strategic Security, Part Three: Implementation
- Strategic Security, Part Four: ITIL Implications
- Production Acceptance Part One – Definition and Benefits
- Production Acceptance Part Two – Initial Steps
- Production Acceptance Part Three – Middle Steps
- Production Acceptance Part Four – Ongoing Steps
- Case Study: Planning a Service Desk Part One – Objectives
- Case Study: Planning a Service Desk Part Two – SWOT
- Case Study: Implementing an ITIL Service Desk – Part One
- Case Study: Implementing a Service Desk Part Two – Tool Selection
- Ethics, Scandals and Legislation
- Outsourcing in Response to Legislation
- Supplier Management
- Identifying Key External Suppliers
- Identifying Key Internal Suppliers
- Integrating the Four Key Elements of Good Customer Service
- Enhancing the Customer/Supplier Matrix
- Voice Over IP, Part One — What VoIP Is, and Is Not
- Voice Over IP, Part Two — Benefits, Cost Savings and Features of VoIP
- Application Management
- Production Acceptance
- Distinguishing New Applications from New Versions of Existing Applications
- Assessing a Production Acceptance Process
- Effective Use of a Software Development Life Cycle
- The Role of Project Management in SDLC— Part 2
- Communication in Project Management – Part One: Barriers to Effective Communication
- Communication in Project Management – Part Two: Examples of Effective Communication
- Safeguarding Personal Information in the Workplace: A Case Study
- Combating the Year-end Budget Blitz—Part 1: Building a Manageable Schedule
- Combating the Year-end Budget Blitz—Part 2: Tracking and Reporting Availability
- References
- Developing an ITIL Feasibility Analysis
- Organization and Personnel Management
- Optimizing IT Organizational Structures
- Factors That Influence Restructuring Decisions
- Alternative Locations for the Help Desk
- Alternative Locations for Database Administration
- Alternative Locations for Network Operations
- Alternative Locations for Web Design
- Alternative Locations for Risk Management
- Alternative Locations for Systems Management
- Practical Tips To Retaining Key Personnel
- Benefits and Drawbacks of Using IT Consultants and Contractors
- Deciding Between the Use of Contractors versus Consultants
- Managing Employee Skill Sets and Skill Levels
- Assessing Skill Levels of Current Onboard Staff
- Recruiting Infrastructure Staff from the Outside
- Selecting the Most Qualified Candidate
- 7 Tips for Managing the Use of Mobile Devices
- Useful Websites for IT Managers
- References
- Automating Robust Processes
- Evaluating Process Documentation — Part One: Quality and Value
- Evaluating Process Documentation — Part Two: Benefits and Use of a Quality-Value Matrix
- When Should You Integrate or Segregate Service Desks?
- Five Instructive Ideas for Interviewing
- Eight Surefire Tips to Use When Being Interviewed
- 12 Helpful Hints To Make Meetings More Productive
- Eight Uncommon Tips To Improve Your Writing
- Ten Helpful Tips To Improve Fire Drills
- Sorting Out Today’s Various Training Options
- Business Ethics and Corporate Scandals – Part 1
- Business Ethics and Corporate Scandals – Part 2
- 12 Tips for More Effective Emails
- Management Communication: Back to the Basics, Part One
- Management Communication: Back to the Basics, Part Two
- Management Communication: Back to the Basics, Part Three
- Asset Management
- Managing Hardware Inventories
- Introduction to Hardware Inventories
- Processes To Manage Hardware Inventories
- Use of a Hardware Inventory Database
- References
- Managing Software Inventories
- Business Continuity Management
- Ten Lessons Learned from Real-Life Disasters
- Ten Lessons Learned From Real-Life Disasters, Part 2
- Differences Between Disaster Recovery and Business Continuity , Part 1
- Differences Between Disaster Recovery and Business Continuity , Part 2
- 15 Common Terms and Definitions of Business Continuity
- The Federal Government’s Role in Disaster Recovery
- The 12 Common Mistakes That Cause BIAs To Fail—Part 1
- The 12 Common Mistakes That Cause BIAs To Fail—Part 2
- The 12 Common Mistakes That Cause BIAs To Fail—Part 3
- The 12 Common Mistakes That Cause BIAs To Fail—Part 4
- Conducting an Effective Table Top Exercise (TTE) — Part 1
- Conducting an Effective Table Top Exercise (TTE) — Part 2
- Conducting an Effective Table Top Exercise (TTE) — Part 3
- Conducting an Effective Table Top Exercise (TTE) — Part 4
- The 13 Cardinal Steps for Implementing a Business Continuity Program — Part One
- The 13 Cardinal Steps for Implementing a Business Continuity Program — Part Two
- The 13 Cardinal Steps for Implementing a Business Continuity Program — Part Three
- The 13 Cardinal Steps for Implementing a Business Continuity Program — Part Four
- The Information Technology Infrastructure Library (ITIL)
- The Origins of ITIL
- The Foundation of ITIL: Service Management
- Five Reasons for Revising ITIL
- The Relationship of Service Delivery and Service Support to All of ITIL
- Ten Common Myths About Implementing ITIL, Part One
- Ten Common Myths About Implementing ITIL, Part Two
- Characteristics of ITIL Version 3
- Ten Benefits of itSMF and its IIL Pocket Guide
- Translating the Goals of the ITIL Service Delivery Processes
- Translating the Goals of the ITIL Service Support Processes
- Elements of ITIL Least Understood, Part One: Service Delivery Processes
- Case Study: Recovery Reactions to a Renegade Rodent
- Elements of ITIL Least Understood, Part Two: Service Support
- Case Studies
- Case Study — Preparing for Hurricane Charley
- Case Study — The Linux Decision
- Case Study — Production Acceptance at an Aerospace Firm
- Case Study — Production Acceptance at a Defense Contractor
- Case Study — Evaluating Mainframe Processes
- Case Study — Evaluating Recovery Sites, Part One: Quantitative Comparisons/Natural Disasters
- Case Study — Evaluating Recovery Sites, Part Two: Quantitative Comparisons/Man-made Disasters
- Case Study — Evaluating Recovery Sites, Part Three: Qualitative Comparisons
- Case Study — Evaluating Recovery Sites, Part Four: Take-Aways
- Disaster Recovery Test Case Study Part One: Planning
- Disaster Recovery Test Case Study Part Two: Planning and Walk-Through
- Disaster Recovery Test Case Study Part Three: Execution
- Disaster Recovery Test Case Study Part Four: Follow-Up
- Assessing the Robustness of a Vendor’s Data Center, Part One: Qualitative Measures
- Assessing the Robustness of a Vendor’s Data Center, Part Two: Quantitative Measures
- Case Study: Lessons Learned from a World-Wide Disaster Recovery Exercise, Part One: What Did the Team Do Well
- (d) Case Study: Lessons Learned from a World-Wide Disaster Recovery Exercise, Part Two
In this final segment of a four-part series on improving high availability, I present some techniques for assessing the quality of an availability process. These techniques apply to any environment, regardless of size, complexity or types of platforms used.
Assessing an Infrastructure's Availability Process
Over the years, many clients have asked me for a quick and simple method to evaluate the quality, efficiency, and effectiveness of their availability process. In response to these requests, I developed the assessment worksheet shown in Figure 1. Process owners and their managers collaborate with other appropriate individuals to fill out this form. Along the left-hand column are 10 categories of characteristics about a process. The degree to which each of these characteristics is put to use in designing and managing a process is a good measure of its relative robustness.
The categories that assess the overall quality of a process are executive support, process owner, and process documentation. Categories assessing the overall efficiency of a process consist of supplier involvement, process metrics, process integration, and streamlining/automation. The categories used to assess effectiveness include customer involvement, service metrics, and the training of staff.
The evaluation of each category is a very simple procedure. The relative degree to which the characteristics within each category are present and being used is rated on a scale of 1 to 4, with 1 indicating no presence and 4 being a large presence of the characteristic. For example, suppose the executive sponsor for the availability process demonstrated some initial support for the process by carefully selecting and coaching the process owner. However, presume that over time this same executive showed no interest in analyzing trending reports or holding direct report managers accountable for outages. I would consequently rate the overall degree to which this executive showed support for this process as little and rate it a 2 on the scale of 1 to 4. On the other hand, if the process owner actively engages key customers in the design and use of the process, particularly as it pertains to availability metrics and service level agreements, I would rate that category a 4.
Each of the categories is similarly rated as shown in Figure 1. Obviously, a single column could be used record the ratings of each category; however, if we format separate columns for each of the four possible scores, categories scoring the lowest and highest ratings stand out visually. I have filled in sample responses for each of the categories to show how the entire assessment might work. We now sum the numerical scores within each column. In our sample worksheet this totals to 2 + 6 + 6 + 12 = 26. This total is then divided by the maximum possible rating of 40 for an assessment score of 65%.
Figure 1. Sample Assessment Worksheet for Availability Process
Apart from its obvious value of quantifying areas of strength and weakness for a given process, this rating provides two other significant benefits to an infrastructure. One is that it serves as a baseline benchmark from which future process refinements can be quantitatively measured and compared. The second is that the score of this particular process can be compared to those of other infrastructure processes to determine which ones need most attention.
Assessing an Availability Process Using Weighted Criteria
In my experience, many infrastructures do not attribute the same amount of importance to each of the 10 categories within a process. I refined the assessment worksheet to account for this uneven distribution of category significance—I allowed weights to be assigned to each of the categories. The weights range from 1 for least important to 5 for most important, with a default of 3.
Figure 2 shows an example of how this works. I provide sample weights for each of the rated categories from our sample in Figure 1. The weight for each category is multiplied by its rating to produce a final score. For example, the executive support category is weighted at 3 and rated at 2 for a final score of 6. We generate scores for the other nine categories in a similar manner and sum up the numbers in each column. In our sample worksheet, the weights add up to 30, the ratings add up to 26 as before, and the new scores add up to 90.
Figure 2. Sample Assessment Worksheet for Availability Process with Weighting Factors
The overall weighted assessment score is calculated by dividing this final score of 90 by the maximum weighted score (MWS). By definition, the MWS will vary from process to process, and from shop to shop, since it reflects the specific weighting of categories tailored to a given environment. The MWS is the product of the sum of the 10 weights % 4, the maximum rating for any category. In our example, the sum of the 10 weights is 30, so our MWS = 90 / (30 * 4) = 75%.
Your next question could well be this: Why is the overall weighted assessment score of 75% higher than the nonweighted assessment score of 65%, and what is the significance of this? The answer to the first question is quantitative and to the second one, qualitative. The weighted score will be higher whenever categories with high weights receive high ratings or categories with low weights receive low ratings. When the reverse occurs, the weighted score will be lower than the nonweighted one. In this case, the categories of customer involvement, supplier involvement, and service metrics were given the maximum weights and scored high on the ratings, resulting in a higher overall score.
The significance of this is that a weighted score will reflect a more accurate assessment of a process because each category is assigned a customized weight. The more frequently the weight deviates from the default value of 3, the greater the difference will be between the weighted and nonweighted values.
We can measure and streamline the availability process with the help of the assessment worksheet shown in Figures 1 and 2. We can measure the effectiveness of an availability process with service metrics such as the percentage of downtime to users and time lost due to outages in dollars. Process metrics—such as the ease and quickness with which servers can be re-booted—help us gauge the efficiency of the process. And we can streamline the availability process by automating actions such as the generation of outage tickets and the notification of users when outages occur.
Summary
This concludes our series on improving high availability. The series began with a definition of terms, continued with desirable traits of an availability process owner, presented some methods for measuring availability, described seven characteristics of high availability and concluded with techniques for assessing the quality of an availability process, regardless of size, complexity, or platforms used.
References
Schiesser, Rich, IT Systems Management, Prentice Hall, 2002