- Alpine-Style Systems Development
- What Is Agile Analytics?
- Data Warehousing Architectures and Skill Sets
- Why Do We Need Agile Analytics?
- Introducing FlixBuster Analytics
- Wrap-Up
Why Do We Need Agile Analytics?
In my years as a DW/BI consultant and practitioner I have learned three consistent truths: building successful DW/BI systems is hard; DW/BI development projects fail often; and it is better to fail fast and adapt than to fail late, after the budget is spent.
First Truth: Building DW/BI Systems Is Hard
If you have taken part in a data warehousing project, you are aware of the numerous challenges, perils, and pitfalls. Ralph Kimball, Bill Inmon, and other DW/BI pioneers have done an excellent job of developing reusable architectural patterns for data warehouse and DW/BI implementation. Software vendors have done a good job of creating tools and technologies to support the concepts. Nonetheless, DW/BI is just plain hard, and for several reasons:
- Lack of expertise. Most organizations have not previously built a DW/BI system or have only limited experience in doing so.
- Lack of experience. Most organizations don't build multiple DW/BI systems, and therefore development processes don't get a chance to mature through experience.
- Ambitious goals. Organizations often set out to build an enterprise data warehouse, or at least a broad-reaching data mart, which makes the process more complex.
- Domain knowledge versus subject matter expertise. DW/BI practitioners often have extensive expertise in business intelligence but not in the organization's business domain, causing gaps in understanding. Business users typically don't know what they can, or should, expect from a DW/BI system.
- Unrealistic expectations. Business users often think of data warehousing as a technology-based plug-and-play application that will quickly provide them with miraculous insights.
- Educated user phenomenon. As users gain a better understanding of data warehousing, their needs and wishes change.
- Shooting the messenger. DW/BI systems are like shining a bright light in the attic: You may not always like what you find. When the system exposes data quality problems, business users tend to distrust the DW/BI system.
- Focus on technology. Organizations often view a DW/BI system as an IT application rather than a joint venture between business stakeholders and IT developers.
- Specialized skills. Data warehousing requires an entirely different skill set from that of typical database administrators (DBAs) and developers. Most organizations do not have staff members with adequate expertise in these areas.
- Multiple skills. Data warehousing requires a multitude of unique and distinct skills such as multidimensional modeling, data cleansing, ETL development, OLAP design, application development, and so forth.
These unique DW/BI development characteristics compound the already complex process of building software or building database applications.
Second Truth: DW/BI Development Projects Fail Often
Unfortunately, I'm not the only one who has experienced failure on DW/BI projects. A quick Google search on "data warehouse failure polls" turns up a small library of case studies, postmortems, and assessment articles. Estimated failure rates of around 50 percent are common and are rarely disputed.
When I speak to groups of business intelligence practitioners, I often begin my talks with an informal survey. First I ask everyone who has been involved in the completion of one or more DW/BI projects to stand. It varies depending on the audience, but usually more than half the group stands up. Then I ask participants to sit down if they have experienced projects that were delivered late, projects that had significant budget overruns, or projects that did not satisfy users' expectations. Typically nobody is left standing by the third question, and I haven't even gotten to questions about acceptable quality or any other issues. It is apparent that most experienced DW/BI practitioners have lived through at least one project failure.
While there is no clear definition of what constitutes "failure," Sid Adelman and Larissa Moss classify the following situations as characteristic of limited acceptance or outright project failure (Adelman and Moss 2000):
- The project is over budget.
- The schedule has slipped.
- Some expected functionality was not implemented.
- Users are unhappy.
- Performance is unacceptable.
- Availability of the warehouse applications is poor.
- There is no ability to expand.
- The data and/or reports are poor.
- The project is not cost-justified.
- Management does not recognize the benefits of the project.
In other words, simply completing the technical implementation of a data warehouse doesn't constitute success. Take another look at this list. Nearly every situation is "customer"-focused; that is, primarily end users determine whether a project is successful.
There are hundreds of similar evaluations of project failures, and they show a great deal of overlap in root causes: incorrect requirements, weak processes, inability to adapt to change, mismanaged project scope, unrealistic schedules, inflated expectations, and so forth.
Third Truth: It Is Best to Fail Fast and Adapt
Unfortunately, the traditional development model does little to uncover these deficiencies early in the project. As Jeff DeLuca, one of the creators of Feature Driven Development (FDD), says, "We should try to break the back of the project as early as possible to avoid the high cost of change later downstream." In a traditional approach, it is possible for developers to plow ahead in the blind confidence that they are building the right product, only to discover at the end of the project that they were sadly mistaken. This is true even when one uses all the best practices, processes, and methodologies.
What is needed is an approach that promotes early discovery of project peril. Such an approach must place the responsibility for success equally on users, stakeholders, and developers, and should reward a team's ability to adapt to new directions and substantial requirements changes.
As we observed earlier, most classes of project failure are user-satisfaction-oriented. If we can continuously adapt the DW/BI system and align with user expectations, users will be satisfied with the outcome. In all of my past involvement in traditional DW/BI implementations I have consistently seen the following phenomena at the end of the project:
- Users have become more educated about BI. As the project progresses, so does users' understanding of BI. So, what they told you at the beginning of the project may have been based on a misunderstanding or incorrect expectations.
- User requirements have changed or become more refined. That's true of all software and implementation projects. It's just a fact of life. What they told you at the beginning is much less relevant than what they tell you at the end.
- Users' memories of early requirements reviews are fuzzy. It often happens that contractually speaking, a requirement is met by the production system, but users are less than thrilled, having reactions like "What I really meant was . . ." or "That may be what I said, but it's not what I want."
- Users have high expectations when anticipating a new and useful tool. Left to their own imaginations, users often elevate their expectations of the BI system well beyond what is realistic or reasonable. This only leaves them disappointed when they see the actual product.
- Developers build based on the initial snapshot of user requirements. In waterfall-style development the initial requirements are reviewed and approved, then act as the scoping contract. Meeting the terms of the contract is not nearly as satisfying as meeting the users' expectations.
All these factors lead to a natural gap between what is built and what is needed. An approach that frequently releases new BI features to users, listens to their feedback, and adapts to change is the single best way to fail fast and correct the course of development.
Is Agile Really Better?
There is increasing evidence that Agile approaches lead to higher project success rates. Scott Ambler, a leader in Agile database development and Agile Modeling, has conducted numerous surveys on Agile development in an effort to quantify the impact and effectiveness of these methods. Beginning in 2007, Ambler conducted three surveys specifically relating to IT project success rates. The 2007 survey explored success rates of different IT project types and methods: only 63 percent of traditional projects and data warehousing projects were successful, while Agile projects enjoyed a 72 percent success rate. The 2008 survey focused on four success criteria: quality, ROI, functionality, and schedule; in all four areas Agile methods significantly outperformed traditional, sequential development approaches. The 2010 survey continued to show that Agile methods in IT produce better results.
I should note here that traditional definitions of success involve metrics such as on time, on budget, and to specification. While these metrics may satisfy management efforts to control budgets, they do not always correlate to customer satisfaction. In fact, scope, schedule, and cost are poor measures of progress and success. Martin Fowler argues, "Project success is more about whether the software delivers value that's greater than the cost of the resources put into it." He points out that XP 2002 conference speaker Jim Johnson, chairman of the Standish Group, observed that a large proportion of features are frequently unused in software products. He quoted two studies: a DuPont study, which found that only 25 percent of a system's features were really needed, and a Standish study, which found that 45 percent of features were never used and only 20 percent of features were used often or always (Fowler 2002). These findings are further supported by a Department of Defense study, which found that only 2 percent of the code in $35.7 billion worth of software was used as delivered, and 75 percent was either never used or was canceled prior to delivery (Leishman and Cook 2002).
Agile development is principally aimed at the delivery of high-priority value to the customer community. Measures of progress and success must focus more on value delivery than on traditional metrics of on schedule, on budget, and to spec. Jim Highsmith points out, "Traditional managers expect projects to be on-track early and off-track later; Agile managers expect projects to be off-track early and on-track later." This statement reflects the notion that incrementally evolving a system by frequently seeking and adapting to customer feedback will result in building the right solution, but it may not be the solution that was originally planned.
The Difficulties of Agile Analytics
Applying Agile methods to DW/BI is not without challenges. Many of the project management and technical practices I introduce in this book are adapted from those of our software development colleagues who have been maturing these practices for the past decade or longer. Unfortunately, the specific practices and tools used to custom-build software in languages like Java, C++, or C# do not always transfer easily to systems integration using proprietary technologies like Informatica, Oracle, Cognos, and others. Among the problems that make Agile difficult to apply to DW/BI development are the following:
- Tool support. Few tools support technical practices such as test-driven database or ETL development, database refactoring, data warehouse build automation, and the others introduced in this book, and the tools that do exist are less mature than their software development counterparts. This state of tool support continues to improve, however, through both open-source and commercial offerings. (A minimal test-driven ETL sketch follows this list.)
- Data volume. It takes creative thinking to use lightweight development practices to build high-volume data warehouses and BI systems. We need to use small, representative data samples to quickly build and test our work, while continuously proving that our designs will work with production data volumes; the sketch after this list exercises a transformation against exactly such a sample. This is more of an impediment to our way of approaching the problem than a barrier inherent in the problem domain: impediments can be eliminated or worked around, whereas barriers are insurmountable.
- "Heavy lifting." While Agile Analytics is a feature-driven (think business intelligence features) approach, the most time-consuming aspect of building DW/BI systems is in the back-end data warehouse or data marts. Early in the project it may seem as if it takes a lot of "heavy lifting" on the back end just to expose a relatively basic BI feature on the front end. Like the data volume challenge, it takes creative thinking to build the smallest/simplest back-end data solution needed to produce business value on the front end.
- Continuous deployment. The ability to deploy new features into production frequently is a goal of Agile development. This goal is hampered by DW/BI systems that are already in production with large data volumes: even a simple data model revision to a production data warehouse can require significant time and careful execution. Frequent deployment may look very different in DW/BI from the way it looks in software development; the second sketch below shows one backward-compatible approach.
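To make the first two difficulties concrete, here is a minimal sketch of test-driven ETL development against a small, hand-built data sample. It is plain Python with no DW/BI tooling assumed, and the transformation (`clean_customer`) and its field names are hypothetical, invented purely for illustration:

```python
# A minimal sketch of test-driven ETL development: the transformation rule
# is written against a tiny, representative sample and verified by a unit
# test before it ever runs against production-scale data. All names here
# (clean_customer, the field names) are hypothetical illustrations.

def clean_customer(row: dict) -> dict:
    """Conform a raw customer record: trim whitespace, normalize case,
    and map missing regions to 'UNKNOWN'."""
    return {
        "customer_id": int(row["customer_id"]),
        "name": row["name"].strip().title(),
        "region": (row.get("region") or "UNKNOWN").strip().upper(),
    }

def test_clean_customer_normalizes_fields():
    raw = {"customer_id": "42", "name": "  ada LOVELACE ", "region": None}
    assert clean_customer(raw) == {
        "customer_id": 42,
        "name": "Ada Lovelace",
        "region": "UNKNOWN",
    }

if __name__ == "__main__":
    test_clean_customer_normalizes_fields()
    print("ETL unit test passed")
```

A test like this runs in milliseconds, so it can be part of an automated warehouse build; proving the same rule at production volume remains a separate, scheduled verification step.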
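For the continuous deployment difficulty, the following is a minimal sketch of one backward-compatible way to evolve a production schema: the "expand" half of an expand/contract migration, in which a new column is added with a safe default so existing queries keep working. SQLite stands in for a real warehouse purely so the example runs anywhere, and the table and column names are hypothetical:

```python
# A minimal sketch of an incremental, backward-compatible warehouse change
# (the "expand" half of an expand/contract migration). SQLite stands in
# for a production warehouse; dim_customer and its columns are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dim_customer VALUES (1, 'Ada Lovelace')")

# Expand: add the new column with a default so existing queries and loads
# keep working unchanged.
conn.execute("ALTER TABLE dim_customer ADD COLUMN region TEXT DEFAULT 'UNKNOWN'")

# Backfill the new column (a single row here; against production volumes
# this would run in small batches to avoid one long-running UPDATE).
conn.execute("UPDATE dim_customer SET region = 'EMEA' WHERE customer_id = 1")

# Old readers still succeed; new readers see the new column.
print(conn.execute("SELECT name, region FROM dim_customer").fetchall())
```

The matching "contract" step, dropping the old structures once nothing reads them, would ship in a later release; splitting a change this way is one means of keeping deployments frequent despite large data volumes.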
The nuances of your project environment may introduce other such difficulties. In general, those who successfully embrace Agile's core values and guiding principles learn how to adapt their processes effectively to mitigate these difficulties. For each of these challenges I find it useful to ask, "Will the project be better off if we overcome this difficulty, however hard that may be?" As long as the answer is yes, it is worth grappling with the challenge in order to make Agile Analytics work. With time and experience these difficulties become easier to overcome.