1.3 What Does It Mean?
The bottom line is that big data analytics enables converting information into an unprecedented amount of business intelligence. It allows companies to precisely understand what happened in the past and why, and better predict the future. The result is a superior competitive capability.
1.3.1 Big Business Intelligence
Not using big data analytics in today’s business world is akin to making deliveries on horseback while competitors use trucks, or relying on carrier pigeons while everyone else uses airfreight. A company just can’t compete without it.
What can big data analytics tell us?
The sheer volume of data makes it possible to spot connections and details that would otherwise go unnoticed. Granular analysis of subcategories and submarkets enables an understanding of customers and markets unlike ever before. Determining what the customer wants used to rely on traditional marketing tools, such as focus groups and surveys; it is now computed using scientific methods. We no longer have to guess or rely on hunches. We can estimate with far greater precision what each customer is likely to buy. This changes the way we sell products. This information also drives entire supply chains.
Large amounts of data make it possible to establish norms in the data, which in turn helps identify data that falls outside those norms, namely outliers. This technique is what we see today in the identification of credit card fraud, where an algorithm automatically detects a change from the norm. It is only possible when there is a large amount of data, and it has huge applications any time we are looking for a deviation. UPS does this, for example, when it monitors its vehicle fleets for preventive maintenance.22 The company cannot afford a breakdown in its fleet of vehicles. Before big data analytics, the company routinely replaced parts, but that was wasteful. Then, in 2000, the company began a program that uses sensors to capture data on vehicle performance and flag deviations that signal when to intervene before a breakdown.
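The idea of establishing a norm and flagging departures from it can be sketched in a few lines of Python. This is a minimal illustration only, assuming a simple z-score rule; the function name `find_outliers`, the threshold of three standard deviations, and the spending figures are all hypothetical, and real fraud-detection systems use far richer models.

```python
from statistics import mean, stdev

def find_outliers(history, new_readings, threshold=3.0):
    """Return the new readings that lie more than `threshold` standard
    deviations away from the mean of the historical data (the "norm")."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different is a deviation.
        return [x for x in new_readings if x != mu]
    return [x for x in new_readings if abs(x - mu) / sigma > threshold]

# Hypothetical daily card spend (dollars) for one customer.
history = [42.0, 38.5, 51.0, 45.2, 40.1, 47.8, 43.3, 39.9, 44.6, 46.0]
new_readings = [44.0, 41.5, 950.0]

outliers = find_outliers(history, new_readings)
print(outliers)
```

The more history is available, the more stable the estimated norm becomes, which is why this kind of check depends on having a large amount of data.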
What else can big data analytics do?
1.3.2 Predicting the Future
Predictive analytics uses a variety of techniques, such as statistics, modeling, and data mining, to analyze current and historical facts and make predictions about the future.23 This is one of the most significant aspects of big data analytics: the ability to foresee events before they happen by sensing small changes over time. For example, IBM’s Watson computer uses an algorithm to predict the best medical treatments24 and UPS uses analytics to predict vehicle breakdowns.25 By placing sensors on machinery, motors, or infrastructure such as bridges, we can monitor the signals they give off, such as heat, vibration, stress, and sound. These sensors can detect changes that may indicate looming problems, essentially forecasting a failure.
Things do not break down all at once. There is gradual wear and tear over time. In the past, our technology, sensors, and analytics were not sophisticated enough to detect these changes. Today, armed with sensor data, correlation analysis, and similar methods, we can identify the specific patterns that typically crop up before something breaks. This may be the sound of a motor, excessive heat from an engine, or vibration in a bridge. In health care, it may be changes in a patient’s vitals before the onset of disease. Google is famous for identifying the location and spread of the flu by simply tracking the volume and type of queries in its search engine.26
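Gradual wear can be illustrated with a small Python sketch that compares a rolling average of recent sensor readings against a baseline taken when the equipment was known to be healthy. This is an assumption-laden toy, not how any particular company does it: the function `first_drift_index`, the window sizes, the threshold, and the vibration values are all hypothetical.

```python
from statistics import mean, stdev

def first_drift_index(readings, baseline_size=10, window=5, threshold=3.0):
    """Return the index of the first reading at which the average of the
    last `window` readings exceeds the baseline mean by more than
    `threshold` baseline standard deviations. Return None if that never
    happens. The first `baseline_size` readings define the healthy norm."""
    baseline = readings[:baseline_size]
    mu, sigma = mean(baseline), stdev(baseline)
    for end in range(baseline_size + window, len(readings) + 1):
        recent = readings[end - window:end]
        if mean(recent) - mu > threshold * sigma:
            return end - 1
    return None

# Hypothetical vibration amplitudes: stable at first, then slowly rising.
healthy = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97, 1.01, 0.99]
wearing = [1.01, 1.03, 1.05, 1.07, 1.09, 1.11, 1.13, 1.15, 1.17, 1.19]

idx = first_drift_index(healthy + wearing)
print(idx)
```

Averaging over a window is a deliberate choice here: no single reading in the rising series is dramatic, but the sustained shift is what signals that it may be time to intervene.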
1.3.3 Fewer Black Swans
Black swan is a term used to describe a high-impact, low-probability event.27 Historically, we assumed that such events could not be predicted. With big data analytics, that is rapidly changing: the number of events that we used to consider unpredictable and purely random is getting smaller. We are now able to spot changes in systems that indicate potential failure. Consider the accuracy of the prediction of Hurricane Sandy provided by the NOAA weather satellite. Only a few years earlier, this type of event would have been considered a black swan.
Spotting an abnormality early enables the system to send out a warning so that a new part can be installed, a malfunction can be fixed, or preparations can be made before an impending tsunami. The aim is to identify and then watch a good proxy for the event we are trying to forecast, and thereby predict the future. External events, such as weather, traffic, or road construction, can be tracked so that the supply chain can respond. Traffic can be rerouted, and knowledge of flu outbreaks can be used to determine which areas may need more supplies of ibuprofen, chicken soup, or cough drops. This ability is a game changer for risk management.
1.3.4 Explain What Has Happened
One of the most powerful analytics tools we can use on big data is correlation analysis. Correlation analysis has been used for decades. What is different today is the insight obtained when it is applied to huge amounts of data. Correlations tell us whether there is a relationship between variables. They don’t tell us why there is a relationship. In the world of statistics, correlation analysis is “quick and dirty” but offers important insights.
Correlations let us analyze a phenomenon by identifying a useful proxy for it. The idea is that if A often takes place together with B, we can monitor B to predict that A is likely to happen.
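As a rough illustration of checking whether B is a useful proxy for A, the Python sketch below computes the Pearson correlation coefficient between two series. The helper `pearson` is written out by hand for clarity, and the weekly figures for flu-related searches and cough drop sales are invented for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series.
    Values near +1 or -1 indicate a strong linear relationship; values
    near 0 indicate little or none."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly data for one region.
flu_searches = [120, 150, 200, 260, 310, 280, 190, 140]   # proxy B
cough_drop_sales = [30, 36, 49, 63, 77, 70, 47, 35]       # outcome A

r = pearson(flu_searches, cough_drop_sales)
print(round(r, 3))
```

A coefficient close to 1 suggests that search volume could serve as a proxy worth monitoring. It says nothing about why the two move together.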
Consider the case of Target and identification of pregnant customers.28 Big data analytics was able to identify the precise purchasing bundle associated with a female customer in the second trimester of pregnancy. Those who have seen the highly publicized story might recall the father of a 16-year-old girl who was very angry at Target for sending his daughter baby coupons—only to discover that indeed she was pregnant. The analytics perfectly targeted her—no pun intended.
Correlation analysis also points the way for causal investigations by telling us which two things are potentially connected. This is an important benefit: it shows us where to investigate further with modeling, causal analysis, and optimization, and where to dig deep with more sophisticated analytics applications such as supply chain optimization.
1.3.5 Explain Why Things Happen
Correlation analysis tells us whether a relationship exists. More advanced statistical applications enable us to go beyond that and delve deeper into understanding causation.
1.3.5a Supply Chain Optimization
Supply chain optimization is the application of mathematical and statistical tools to develop optimal solutions to supply chain problems. It enables analysts to create models to simulate, explore contingencies, and optimize supply chains. Many of these approaches employ some form of linear programming software and solvers, which maximize or minimize a particular objective given a set of variables and constraints. Typical objectives include optimizing the placement of inventory within the supply chain, minimizing the carbon footprint, and minimizing operating costs such as manufacturing, transportation, and distribution costs.
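To make the idea of an objective, variables, and constraints concrete, here is a toy linear program in Python. It is only a sketch: instead of a real solver it uses brute-force vertex enumeration, which works for two variables because an optimum of a bounded linear program lies at a corner of the feasible region. The function `solve_lp_2d`, the two plants, and their costs and capacities are hypothetical, and integer coefficients are assumed.

```python
from fractions import Fraction
from itertools import combinations

def solve_lp_2d(cost, constraints):
    """Minimize cost[0]*x + cost[1]*y subject to a*x + b*y <= c for every
    (a, b, c) in `constraints`. Checks each intersection of two constraint
    boundaries and keeps the cheapest feasible one. Assumes integer
    coefficients and a bounded feasible region. Returns (value, x, y),
    or None if no feasible corner exists."""
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries do not meet at a single point
        # Cramer's rule for the intersection of the two boundary lines.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            value = cost[0] * x + cost[1] * y
            if best is None or value < best[0]:
                best = (value, x, y)
    return best

# Hypothetical problem: ship x units from plant A ($4 each) and y units
# from plant B ($6 each) to cover a demand of 100 units.
constraints = [
    (-1, -1, -100),  # x + y >= 100  (meet demand)
    (1, 0, 70),      # x <= 70       (capacity of plant A)
    (0, 1, 80),      # y <= 80       (capacity of plant B)
    (-1, 0, 0),      # x >= 0
    (0, -1, 0),      # y >= 0
]
value, x, y = solve_lp_2d((4, 6), constraints)
print(value, x, y)
```

The cheaper plant is used up to its capacity and the remainder comes from the more expensive one. Real supply chain models have thousands of variables and rely on dedicated solvers rather than enumeration.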
1.3.5b Randomized Testing
Big data and analytics have enabled companies to use randomized testing to conduct experiments to “test and learn,” an approach sometimes called design of experiments. Randomized testing is a statistical method that involves conducting, analyzing, and interpreting tests to evaluate which factors impact variables of interest. For example, a company might ask whether planned changes in delivery or store layouts will increase customer purchases. Randomized testing is at the heart of the scientific method. Without random assignment to test groups, and without a control group, it is impossible to know which improvements are actually due to the changes being made. This type of large-scale testing is now possible because there is ample data to compare and analyze.
Another significant enabler is that many current software applications are designed for people with little statistical training. New software makes it possible for businesspeople, rather than professional statisticians, to conduct design of experiments. For example, testing alternative versions of Web sites is relatively straightforward and is becoming widely practiced among online retailers. Simple A/B experiments, such as comparing two versions of a Web site (A versus B), can be easily structured. The online retailer eBay, for example, routinely conducts experiments with different aspects of its Web site.29 The site generates huge amounts of data, with more than a billion page views per day. This enables eBay to conduct multiple experiments concurrently without running out of treatment and control groups. Similarly, the North Carolina food retailer Food Lion uses testing to try out new retailing approaches, again simply comparing A versus B.30 These tests range from comparing new store formats to simple tactical decisions.
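A simple A/B experiment can be sketched in Python as two steps: randomly assigning visitors to a control and a treatment group, and then testing whether the difference in conversion rates is larger than chance would explain. The sketch below uses a two-proportion z-test as one common choice; the function names, the fixed random seed, and the visitor and purchase counts are hypothetical and are not drawn from eBay or Food Lion.

```python
import random
from math import sqrt
from statistics import NormalDist

def assign_groups(customer_ids, seed=0):
    """Randomly split customers into a control group and a treatment group."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.
    Returns (z, p_value); a small p_value suggests the difference between
    version A and version B is unlikely to be due to chance alone."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

control, treatment = assign_groups(range(10))

# Hypothetical results: version A (control) versus version B (new layout).
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(round(z, 2), round(p_value, 4))
```

Random assignment is what allows the difference between the groups to be attributed to the change itself rather than to differences in who happened to see each version.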