Lessons Learned in Defect Triage
- Getting the Right People in the Room
- Managing the Process
- Getting the Message Out
- Using Metrics to Improve the Process
- Where to Go from Here
On a recent project, I joined a team that was struggling to deliver reliable software on time. The team consisted of around 20 developers and testers, mostly collocated, and was going through a lot of changes at the time I came on board. The biggest change was a switch to Scrum as the team's development process. In the new process, the programmers and testers would work from two sources: New features would be developed from the product backlog, and defects and issues would be worked from the defect triage process.
Changing the development process meant changing many related processes as well. We knew immediately that we needed to tackle the way in which we managed defects. At the time, defect management was a mix of different project databases, different ticket workflows, and different release schedules, with no formal way to track it all. In short, we had a mess.
Major, sweeping process changes, while ideal in a situation like this, can be painful and time-consuming. We didn't have the time to stop, restructure everything, and train everyone on a new process. We had a large number of commitments, some of which were already behind schedule, and had to keep everyone focused on delivery. Therefore, we decided to start planning small incremental changes that we could roll out over time. While that was going on, we would implement defect triage meetings on a project-by-project basis. This article describes how we made those defect triage meetings effective.
Getting the Right People in the Room
Two primary factors drove our team's triage process:
- Ticket priority (How important is it?)
- Ticket severity (How much pain does it cause?)
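As a rough illustration of how these two factors can combine, the sketch below orders tickets by priority first and uses severity as a tie-breaker. The four-step `Level` scale, the `Ticket` fields, the ticket IDs, and the priority-before-severity ordering are all assumptions made for the example; the article does not say how the team's tracker represented or weighted them.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical four-step scale; a lower value means more urgent.
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    priority: Level  # How important is it?
    severity: Level  # How much pain does it cause?


def triage_order(tickets):
    """Return tickets sorted by priority, then by severity (most urgent first)."""
    return sorted(tickets, key=lambda t: (t.priority, t.severity))


# Made-up backlog used only to show the ordering.
backlog = [
    Ticket("T-3", Level.MEDIUM, Level.CRITICAL),
    Ticket("T-1", Level.HIGH, Level.LOW),
    Ticket("T-2", Level.HIGH, Level.CRITICAL),
]
ordered_ids = [t.ticket_id for t in triage_order(backlog)]
print(ordered_ids)  # ['T-2', 'T-1', 'T-3']
```

A fixed sort like this is only a starting point for discussion. The other factors a triage group weighs, such as deadlines or dependencies, are judgment calls that a simple sort key does not capture.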
From time to time, estimates for how long a fix might take could play a role. Other factors also came into play on occasion: a deadline commitment, dependency on another issue, availability of a resource with specialized knowledge, work already taking place on similar or related issues, etc. To define priorities and severities efficiently, we needed to get the right people in the room. For this team, those people consisted of the following:
- The various project managers for each of our clients
- The technical leadership for the development team
- Select representatives from the quality assurance team
One area of struggle was finding representation for some of our "faceless" stakeholders: internal end-users and the technical operations department. Both of those groups had a hard time sending regular attendees to the meetings. From time to time, we would ask quality assurance team members to act as representatives for those missing stakeholders.
The goal in selecting attendees is to keep the group small (5–10 people at most) while still including enough people to make the right decisions. We also wanted to build a team of people who understood the deadlines and commitments, knew how to assess severity and impact, and could estimate at a high level the work that was needed to handle an issue. The team we assembled would determine which issues were fixed first, so it had to have enough credibility and authority to ensure that decisions made at the meetings were carried out.
Once we knew who the right people were, we set a meeting schedule that worked for everyone. If you hold meetings too often, people won't attend. If meetings are too infrequent, the team won't be effective. We decided to meet four days a week, alternating our meetings between client-facing issues and platform-facing issues. Because each of these sets of issues pulled a slightly different audience, some people needed to meet four times a week, and others needed to attend only two meetings a week. We also found that scheduling the meeting at the same time each day reduced confusion.