What Is Analytics?
The term analytics has been used for a long time. So what has made its use so popular now?
The internet companies of Silicon Valley have helped. They use the term analytics to refer to keeping track of who is clicking on your website, which pages they visit, what they buy, and so on. But it is hard to imagine that the term would have spread as it has if it were just about tracking website performance.
Some companies selling reporting systems (or business intelligence software, if you want to use the industry term) have also helped. These companies claim that analytics is the ability to report on your data in easier and more powerful ways. But again, the term wouldn’t be so popular if it were just about reporting.
We believe that the term entered the mainstream business vernacular when Harvard Business Review published the article, “Competing on Analytics,” by Thomas Davenport in January 2006 (and then a book by the same title).3 In this article, Davenport highlights how companies like Amazon, Marriott, Harrah’s, and Capital One “have dominated their fields by deploying industrial-strength analytics across a wide variety of activities.”
This was a tipping point for the term analytics. The article shows that you can apply analytics to a wide range of problems, and that it isn’t just a niche area (like tracking website visitors or creating better reports). It’s bigger than that.
But we still haven’t actually defined what exactly analytics is. The Davenport article gives examples of firms solving specific problems. For example, Marriott uses analytics to set the optimal price for rooms, and Capital One uses it to analyze experiments with different prices, promotions, and bundled services to attract the right customers. But the article only defines analytics as the ability to “collect, analyze, and act on data.” In other words, according to this article, analytics is about using data to make better decisions. This definition, while correct and compact, does not give much guidance. Haven’t managers always talked about using data to make decisions? They have, but there is now much more data available. Simply using more data to make more decisions doesn’t really help us define analytics.
The contribution of the Davenport article isn’t that it defined analytics. Rather, the article helped create the analytics movement. That is, it introduced the idea that a lot of people are solving a lot of different problems using a lot of different tools—and that all these tools are being called “analytics.”
The article sparked the idea of analytics as a field (like the field of computer science or of chemistry). As evidence of this, many universities have started to offer degrees in analytics. Professional organizations dedicated to data analysis, like INFORMS, have also started to shape the field.4
What has emerged from academic and serious business thinkers is a definition of analytics that categorizes the different objectives when using data to make better decisions.5 We will stick to this emerging definition of analytics throughout this book. So, here is the definition of analytics:
- Analytics is the collection of disciplines that use data to gain insight and help make better decisions. It is composed of descriptive analytics to help describe, report on, and visualize the data; predictive analytics to help anticipate trends and identify relationships in the data; and prescriptive analytics to help guide the best decisions with a course of action given the data you have and the trends you expect.
Another way of looking at this is to say descriptive analytics aims to provide an understanding of what happened or is happening, predictive analytics aims to tell you what will or may happen next, and prescriptive analytics aims to tell you what you should do. Each of these areas of analytics can be broken down further—which we do later in the book—and different tools and techniques are applied to each.
We expect this definition of analytics to hold up over time. It is specific enough to give meaning to the term, while broad enough to allow for future development. New subcategories are likely to come into existence, new tools will surely be developed, and people will gain new insights. This definition gives you a way to understand those changes and how they fit in.
This book dives into each of these areas of analytics to give you more insight. Keep in mind that although we may cover a certain subcategory of analytics or a single tool in just a couple of paragraphs, there could be entire university departments, professional organizations, and companies dedicated to just that subcategory or tool. We are by no means understating the importance of that topic. Rather, our goal is to provide you with enough insight so that you can better understand the field of analytics.
To help understand the types of analytics further, let’s explore a few examples.
Examples of Descriptive Analytics
A great early example of descriptive analytics comes from John Snow’s work during London’s deadly 1854 cholera outbreak. The data on where people were dying was readily available but wasn’t helping anyone contain the outbreak. Snow decided to plot the deaths on a map to see if he could get additional insight (see Figure 1.1). In this map, the small black dots represent the residences where people died from the disease. When Snow and others who lived in London at the time looked at the map, they could see clearly that the deaths were centered around a certain water pump.6 (As noted in the “Endnotes” section, a modern version of this map, created with modern mapping tools, shows the data even better.) Looking at the data this way helped narrow the search for the source of the cholera outbreak to a particular water pump.
Figure 1.1 John Snow’s 1854 Cholera Map
Of course, in reality, it took quite a long time to convince the skeptics that the water pump was the source. But, in the end, Snow was correct, and his map played an important part in convincing people. Some even say that this was the start of the field of epidemiology (the science of studying the patterns and causes of diseases). In this case, Snow visualized data in a new way, which led to a much greater understanding and helped convince others. This is the power of descriptive analytics.
Interestingly, Snow’s idea is being applied in a modern way in Lahore, Pakistan, to help prevent the spread of the mosquito-borne disease dengue fever. The city health team is using a smartphone app to record the presence of infected mosquito larvae and plotting this information on a map along with known infections. This information describes the extent or potential for dengue fever, and officials can use it to help determine where to spray. It is a clever update of Snow’s idea and shows the power of geographic visualization.
The case of Circle of Moms7 shows that a little descriptive analytics can lead to big strategic decisions. A Facebook application originally called Circle of Friends went viral shortly after it was created and had grown to 10 million sign-ups by 2008. Although it looked like a potentially popular social website, the founder and CEO realized the company had a problem: almost no one was actually using the application. The founder knew he needed engaged users (not just a lot of people signed up) to have a company with real value. Simple descriptive analytics came to the rescue. The founder began to spend time looking at the data on the users the company did have. When he investigated the data, he found that one subgroup was using the application much more than others: moms. The moms were more engaged, had longer conversations, posted pictures, and did other things that indicated they were a happy community group. So the Circle of Friends management team decided to modify the entire business and make it Circle of Moms. Having the ability to look at data and understand it in new ways can lead to big strategic decisions like this one.
Examples of Predictive Analytics
You often interact with predictive analytics through consumer websites, although you may not have known it. Netflix and Amazon are now widely known for their predictive analytics. Based on your ratings (or your purchase and viewing history) and the ratings of others, Netflix and Amazon recommend additional merchandise that you will probably like. It is a bit difficult for an outsider to know how valuable a good prediction system is to these companies. However, we know that Netflix must believe it is pretty important, because the company ran an entire contest (open to anyone except Netflix employees or close associates) solely to create an improved recommendation algorithm. In 2009, the winning team, which demonstrated a 10% better recommendation system, was awarded the $1 million prize.
Many websites run tests to predict which designs will work better. In 2008, for example, the Obama campaign was designing its website to maximize the number of people who signed up to be on its email list. Once they were on the email list, people could be contacted for donations or to volunteer. The design team had a choice of several different buttons with different labels (like “Join Us Now,” “Learn More,” and “Sign Up”) and about six different pictures or videos. Instead of arguing about which combination was best, they ran a test. They were getting enough hits to their website that they could randomly assign each visitor one of the different combinations and then track what happened. They ran these tests against a control (the existing design) to predict whether the change would have an impact. In the end, they found a combination that they claimed led to a 40% increase in the number of people who signed up to be on the email list. They claimed that this led to a big increase in donations and volunteer hours.8
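To make the mechanics of such a test concrete, here is a minimal Python sketch of random assignment and a lift calculation against a control. It is not the campaign’s actual system: the variant names, the sign-up rates, and the function names run_split_test and relative_lift are all invented for illustration.

```python
import random

def run_split_test(variants, assumed_rates, visitors, seed=0):
    """Randomly assign each visitor to one variant and count sign-ups.

    assumed_rates holds made-up sign-up probabilities used only to
    simulate visitor behavior; a real test would record real visitors.
    """
    rng = random.Random(seed)
    shown = {v: 0 for v in variants}
    signups = {v: 0 for v in variants}
    for _ in range(visitors):
        variant = rng.choice(variants)          # random assignment
        shown[variant] += 1
        if rng.random() < assumed_rates[variant]:
            signups[variant] += 1
    return shown, signups

def relative_lift(shown, signups, control, variant):
    """Relative change in sign-up rate of a variant versus the control."""
    control_rate = signups[control] / shown[control]
    variant_rate = signups[variant] / shown[variant]
    return (variant_rate - control_rate) / control_rate

# Hypothetical variants and rates, chosen only for the illustration.
variants = ["control", "learn_more"]
assumed_rates = {"control": 0.08, "learn_more": 0.11}
shown, signups = run_split_test(variants, assumed_rates, visitors=20000)
print(f"Observed lift: {relative_lift(shown, signups, 'control', 'learn_more'):.1%}")
```

The important design choice is the random assignment: because each visitor is equally likely to see either design, a difference in sign-up rates can be attributed to the design rather than to who happened to visit. A real test would also check that the difference is larger than what chance alone would produce.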
Examples of Prescriptive Analytics
To borrow an example from the book The Optimization Edge,9 you interact with prescriptive analytics when you use a GPS system or an online map to get directions. You enter the start and end points, and then the program tells you (or prescribes) how to get there. Most people don’t know that mathematical optimization algorithms are what allow this to happen.
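As a small illustration of the kind of optimization involved, the sketch below finds the fastest route through a tiny road network using Dijkstra’s shortest-path algorithm. Real navigation systems use more elaborate methods and live traffic data; the road network, the travel times, and the function name shortest_route are assumptions made up for this example.

```python
import heapq

def shortest_route(roads, start, end):
    """Return (total_minutes, path) for the fastest route, or None.

    roads maps each place to a list of (neighbor, minutes) pairs.
    """
    queue = [(0, start, [start])]   # (minutes so far, current place, path taken)
    visited = set()
    while queue:
        minutes, place, path = heapq.heappop(queue)
        if place == end:
            return minutes, path
        if place in visited:
            continue
        visited.add(place)
        for neighbor, cost in roads.get(place, []):
            if neighbor not in visited:
                heapq.heappush(queue, (minutes + cost, neighbor, path + [neighbor]))
    return None

# Hypothetical one-way road segments with travel times in minutes.
roads = {
    "home": [("a", 5), ("b", 2)],
    "b": [("a", 1), ("office", 10)],
    "a": [("office", 4)],
}
print(shortest_route(roads, "home", "office"))
```

In this made-up network the direct-looking route through "a" takes 9 minutes, while the route the algorithm prescribes (home, b, a, office) takes 7.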
Although you might not think about it, matching kidney donors and recipients is also a prescriptive analytics problem.10 A person needs only one healthy kidney. Therefore, a healthy donor can donate one healthy kidney to someone who has no functioning kidneys. The alternatives are dialysis and getting a kidney from a deceased person. Waiting for a deceased donor can take a long time, though. Also, the quality and length of life are much better if a recipient receives a kidney from a live donor. Knowing this, a person in need of a kidney can sometimes find a family member or friend who is willing to donate a healthy kidney. However, the problem is that there is a good chance the kidneys won’t be compatible; just because the donor and recipient are family members or friends doesn’t mean they will be a match. Say that you need a kidney, and your brother is willing to give you one, but you’re not a match. If you both get into a database of donor–recipient pairs, you can be matched with a compatible donor, and your brother can be matched with a compatible recipient. You don’t actually get your brother’s kidney, but he gives up one of his kidneys so you can get one from someone else.
To make a kidney donor–recipient database work, matching organizations use mathematical optimization techniques to prescribe matches. That is, they look at all the possible combinations of matches and pick the set that matches the most people with the highest possible compatibility. The mathematical optimization engine makes this possible. Without it, looking at all the combinations would be impossible.11 Optimization helps connect many more people and improve many more lives than would otherwise be possible.
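The following Python sketch shows the idea on a toy scale. It is heavily simplified and is not how any matching organization actually works: it judges compatibility by ABO blood type only (real programs also consider tissue typing and other medical factors), it allows only two-way swaps between pairs, and it searches every combination by brute force, which is feasible only for a handful of pairs. The function names and the sample pairs are invented for illustration.

```python
# ABO blood-type compatibility: donor type -> recipient types it can donate to.
CAN_DONATE_TO = {
    "O": {"O", "A", "B", "AB"},
    "A": {"A", "AB"},
    "B": {"B", "AB"},
    "AB": {"AB"},
}

def compatible(donor_type, recipient_type):
    return recipient_type in CAN_DONATE_TO[donor_type]

def best_swaps(pairs):
    """Find the largest set of two-way swaps among donor-recipient pairs.

    pairs is a list of (donor_type, recipient_type) tuples; the result
    is a list of (i, j) index pairs that exchange kidneys.
    """
    def can_swap(i, j):
        # Each donor must be compatible with the other pair's recipient.
        return (compatible(pairs[i][0], pairs[j][1])
                and compatible(pairs[j][0], pairs[i][1]))

    def search(remaining):
        if len(remaining) < 2:
            return []
        first, rest = remaining[0], remaining[1:]
        best = search(rest)                      # option: leave `first` unmatched
        for k, other in enumerate(rest):
            if can_swap(first, other):
                candidate = [(first, other)] + search(rest[:k] + rest[k + 1:])
                if len(candidate) > len(best):
                    best = candidate
        return best

    return search(tuple(range(len(pairs))))

# Hypothetical incompatible pairs: (donor blood type, recipient blood type).
pairs = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "A"), ("AB", "O")]
print(best_swaps(pairs))
```

With these made-up pairs, two swaps are possible and the last pair stays unmatched. The number of combinations grows explosively as pairs are added, which is why real programs rely on optimization engines rather than exhaustive search.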
The following examples show companies using all three types of analytics (descriptive, predictive, and prescriptive) to improve their efficiency. As you’ll see with both the DC Water and Coca-Cola cases, a lot of the value of analytics results from combining different types of analytics to come up with a solution that just wasn’t possible several years ago.
An Example Using Descriptive, Predictive, and Prescriptive Analytics: DC Water12
DC Water, the water company for the Washington, DC, area, serves more than 2 million customers with several thousand miles of pipes and maintains nearly 10,000 fire hydrants. The average age of the pipes is over 75 years. DC Water documented its efforts with IBM to go from a company that used mostly paper records and limited data to an organization that improved performance with descriptive, predictive, and prescriptive analytics. This case highlights how each subcategory of analytics offers value and how they can all complement one another.
By using descriptive analytics, DC Water mapped the location of all the city’s fire hydrants. Just visualizing the hydrants allowed the company to create better maintenance plans; before it did, it had been difficult to make sure every hydrant was being properly maintained. DC Water also added extra sensors to water pipes to better monitor water usage and look for anomalies.
With so many aging pipes, failures were a big problem. In the past, DC Water simply reacted to the failures. With predictive analytics, the company could now look at factors such as the age of the pipe, soil conditions (gathered from better descriptive analytics), pressure on the pipes, and nearby problems, and use statistical models to predict when pipes would fail. This allowed DC Water to address potential issues before they became actual problems. It also helped the company prioritize preventive maintenance.
By using prescriptive analytics, DC Water could better route maintenance crews to fix trouble tickets, which increased the productivity of the maintenance team while driving down fuel cost. By knowing the location of each hydrant, existing work order, or preventive maintenance project, DC Water could better route the trucks and crews to the best locations.
An Example Using Descriptive, Predictive, and Prescriptive Analytics: Coca-Cola Orange Juice Plant13
Businessweek published an article on Coca-Cola’s new state-of-the-art orange juice plant. The goal of the plant was to produce high-quality and consistent orange juice year-round. The article showed how Coca-Cola was using analytics to accomplish this.
As an example of descriptive analytics, Coca-Cola used satellite images of the orange groves in Brazil to determine when different fields were ready for harvest. By being able to see the orange groves in a new way, Coca-Cola was able to better harvest the oranges for increased quality.
Coca-Cola used predictive analytics to predict the quality of oranges coming from different fields by looking at different weather patterns. This helped determine what types of oranges would arrive and whether the company would have to acquire oranges from other locations (if a particular location was likely to have a low-quality harvest). Coca-Cola also analyzed the chemical components of oranges to predict the taste and the range where people could detect a difference. For example, using made-up numbers to illustrate the point, Coca-Cola would need to determine whether the sweetness needed to stay between 100 and 200 or between 155 and 175 to maintain consistency.
Finally, Coca-Cola used prescriptive analytics to determine how to blend the oranges of different quality and characteristics to get the desired output. The mathematical optimization looks at all the potential oranges available and comes up with the mix that gets exactly the right combination of all the different chemical characteristics at the lowest cost.
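A toy version of this blending problem can be sketched in Python. This is not Coca-Cola’s model: it tracks a single characteristic (sweetness, on the same made-up scale used above), it tries every mix in 5% steps instead of using a true optimization solver, and the batch names, sweetness values, costs, and the function name cheapest_blend are all invented for illustration.

```python
def cheapest_blend(batches, low, high, step=5):
    """Find the lowest-cost mix whose blended sweetness lies in [low, high].

    batches maps a batch name to (sweetness, cost_per_unit).
    Returns (cost_per_unit, mix) with the mix in percent, or None if no
    mix on the grid meets the target range.
    """
    names = list(batches)
    best = None

    def try_mixes(i, remaining, mix):
        nonlocal best
        if i == len(names) - 1:
            full_mix = mix + [remaining]         # last batch takes what is left
            sweetness = sum(p * batches[n][0] for p, n in zip(full_mix, names)) / 100
            cost = sum(p * batches[n][1] for p, n in zip(full_mix, names)) / 100
            if low <= sweetness <= high and (best is None or cost < best[0]):
                best = (cost, dict(zip(names, full_mix)))
            return
        for percent in range(0, remaining + 1, step):
            try_mixes(i + 1, remaining - percent, mix + [percent])

    try_mixes(0, 100, [])
    return best

# Hypothetical batches: name -> (sweetness, cost in cents per unit).
batches = {
    "grove_a": (140, 100),
    "grove_b": (170, 150),
    "grove_c": (190, 200),
}
print(cheapest_blend(batches, low=155, high=175))
```

With these invented numbers, the cheapest acceptable blend is half grove_a and half grove_b: it uses as much of the cheapest batch as the sweetness floor allows. A real blending model would handle many characteristics at once and would typically be formulated as a linear program.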