Challenges with Business Analytics
Optimization modeling and heuristic tools have yet to make the transition into the Big Data era. In an October 2013 issue of the Wall Street Journal, John Jordan of Penn State University describes several challenges involved in implementing business analytics.17 He notes that there is “a greater potential for privacy invasion, greater financial exposure in fast-moving markets, greater potential for mistaking noise for true insight, and a greater risk of spending lots of money and time chasing poorly defined problems or opportunities.” This section discusses some of the challenges associated with prescriptive analytics and offers practical recommendations on how to avoid them.
Lack of Management Science Experts
The everyday use of mathematical modeling and other techniques requires that business managers and other practitioners have sound numeracy and mathematical skills. However, such skills are in short supply, especially in small and medium-sized organizations. It is estimated that by 2018, U.S. universities and other educational institutions will need to produce between 140,000 and 190,000 more graduates for deep analytical talent positions and 1.5 million more data-savvy managers.18
Business analytics in general, and prescriptive analytics in particular, can become more “popular” with the use of spreadsheet modeling. Spreadsheet modeling is widely used in colleges and universities for teaching mathematical programming. Instead of heavy modeling, which seeks optimal solutions, spreadsheet modeling techniques rely on simpler formulations, which seek practical solutions. However, spreadsheets are limited in the amount of data they can store. They cannot hold data about millions of bank transactions or the details of federal spending on transportation projects, even for a week. There comes a time when organizations need to introduce more advanced tools.
Analytics Brings Change in the Decision-Making Process
The goal of prescriptive analytics is to bring business value through better strategic and operational decisions. At the strategic level, those who make decisions about what models to implement and what needs to be measured will accrue more power. At the operational level, the implementation of such models brings a power shift in the decision-making process. Information-based decisions across organizational boundaries can upset traditional power relationships.
The story of the Oberweis Dairy19 is an excellent example of how data analytics can transform organizations. The company started in 1915 as an Illinois farmer selling his surplus milk to neighbors. Today, the company has “three distribution channels: home delivery, with thousands of customers; retail, with 47 corporate and franchise stores; and wholesale, to regional and national grocery chains like Target.” The company’s usual approach to decision making was to ask executives to figure out the best configuration for future changes.
In 2012, the company was trying to expand its operations geographically, and the chief executive officer (CEO) asked a data analytics executive with only three years of experience to join the strategy table. Bringing analytics to the table changed preconceived notions about the customers. Although current customers benefited from the company’s offerings, data analysis indicated that the company had spent a lot of money acquiring customers who should not have been approached in the first place. Contrary to the company’s conventional wisdom, customer sales data indicated that “the so-called Beamer and Birkenstock group—liberal, high-income, BMW-driving, established couples living leisurely lifestyles” was not a good fit for the dairy farm. So the meeting changed from a tactical one focused on “how many trucks and transfer centers would be required” into a strategic “define the target market” meeting. The change is cultural, and it has grown to the point where people want a better understanding of analytics tools because they can see a real benefit.
Big Data Leads to Incorrect Information
Modeling with business analytics is more an art than a science. One fundamental step in building mathematical models is the process of abstraction. Through this process, the modeler eliminates or suppresses unnecessary details and allows only the relevant information to enter the model. When good information goes into the model, a good model produces good results. The opposite is known as GIGO (garbage in, garbage out). In the era of Big Data, it is significantly more difficult for the data analyst to mine the mountains of information and find the relevant pieces.
Valid models often produce poor results, which lead to wrong decisions, and in the era of Big Data this happens frequently. A recent story20 reports how ten volunteers checked the accuracy of their information on AboutTheData.com, and each of them found inaccuracies. In one specific case, a volunteer found that “she had two teens, at 26.” Interestingly, a CNN team found that Acxiom, the company that runs the database, was more accurate in specifying interests and less accurate in demographic data (marital status, number of children). Wrong assumptions can lead to wrong decisions. A company purchasing this database may know the interests of its future customers, but it may very well send out 2 million direct mail pieces about baby products to people who do not even have children.
Big Data Demands Big Thinking
Business organizations are just entering the new paradigm of Big Data. They have been using standard databases for more than three decades and have accumulated experience and knowledge with them. Big Data, however, demands new techniques, many of which are still in the developmental stages. Acquiring the new tools requires a radical change in underlying beliefs or theory—a new way of thinking. It requires, for example, that more people think probabilistically rather than anecdotally. It also requires that managers learn to focus on the signals and not get lost in the noise.21 This way, organizations will be able to better understand the factors behind customers, products, and services, and how to make analytical decisions.
What is Big Data? Big Data is a combination of structured, in-house operational databases, external databases, and automatically captured, often unstructured data from social networks, web server logs, banking transactions, web page content, financial market data, and so on. All this data, coming from a wide variety of sources, is combined into a non-normalized data warehouse schema. Big Data is usually characterized by three Vs: volume, velocity, and variety:22
Volume—Today, the high volume of business transactions is automatically captured by advanced enterprise information systems. Unstructured sources and external databases also produce large amounts of data. These sources are then combined into denormalized data warehouses. Denormalized (or unnormalized) data carries intentional redundancy—related attributes are stored together and repeated on each row—which further increases volume. The volume of Big Data is larger than the volume processed by conventional relational databases.
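To make the idea of intentional redundancy concrete, the minimal sketch below contrasts a normalized design (customer details stored once and referenced by key) with a denormalized, warehouse-style record that repeats those details on every order row. The table and field names are hypothetical.

```python
# Normalized design: customer details stored once, referenced by key.
customers = {101: {"name": "A. Smith", "city": "Aurora"}}
orders = [
    {"order_id": 1, "customer_id": 101, "amount": 24.50},
    {"order_id": 2, "customer_id": 101, "amount": 18.75},
]

# Denormalized warehouse-style rows: customer details repeated on every
# order row (intentional redundancy), so analysis needs no joins.
orders_denormalized = [
    {"order_id": o["order_id"],
     "amount": o["amount"],
     **customers[o["customer_id"]]}
    for o in orders
]
print(orders_denormalized)
```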
Descriptive and predictive analytics benefit from the high volume of data. After all, the reliability of statistical analysis and predictions improves as the amount of data increases. A forecasting method with hundreds of factors can predict better than one with only a few input factors. Prescriptive models also benefit from Big Data. They are based on aggregated inputs, which are the result of descriptive analytics: contribution coefficients, average processing times, means of distributions, and so on. The validity of these aggregate values improves with high-volume data.
Stochastic models can benefit from high-volume data as well. Statistical distributions are more reliable when fitted with a large number of data points. A prescriptive model that assumes a normal distribution for processing times is more reliable when the mean and standard deviation of that distribution are estimated from thousands or millions of data points rather than only hundreds.
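The following sketch illustrates this point with made-up processing times: it fits the mean and standard deviation of a normal distribution from samples of increasing size and shows how the standard error of the estimated mean shrinks as the sample grows. The "true" parameters (12.0 and 3.0) are assumptions used only for the illustration.

```python
import numpy as np

# Hypothetical processing times drawn from a "true" normal distribution
# (mean 12.0 minutes, standard deviation 3.0), used only to illustrate
# how fitted parameters stabilize as the sample grows.
rng = np.random.default_rng(seed=1)
true_mean, true_sd = 12.0, 3.0

for n in (100, 10_000, 1_000_000):
    sample = rng.normal(true_mean, true_sd, size=n)
    est_mean = sample.mean()
    est_sd = sample.std(ddof=1)
    # The standard error of the mean shrinks as 1/sqrt(n), so estimates
    # based on millions of observations are far more reliable.
    std_err = est_sd / np.sqrt(n)
    print(f"n={n:>9,}  mean={est_mean:6.3f}  sd={est_sd:5.3f}  std. error={std_err:.4f}")
```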
Velocity—Velocity refers to the rate at which data flows into an organization. Online sales, mobile computing, smartphones, and social networks have significantly increased the flow of information into organizations. Organizations can analyze customer behavior, sales history, and buying patterns. They are able to quickly produce operational business intelligence and recommend additional purchases or customized marketing strategies. The velocity of system output is also important: recommendations must be delivered in a timely manner and must be included as part of business operations. A loan officer, for example, could compare the information in a loan application against business rules and mining models, and make a recommendation to the applicant or a decision about the loan.
Prescriptive modeling techniques can take advantage of velocity. They can be designed to run in the background, taking incoming data and producing optimal or near-optimal decisions.
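As an illustration of the loan-officer example above, the sketch below scores each incoming application against a few simple business rules. The thresholds and field names are hypothetical; a real system would combine such rules with the organization's mining models.

```python
from dataclasses import dataclass

@dataclass
class LoanApplication:
    annual_income: float
    requested_amount: float
    credit_score: int

def recommend(app: LoanApplication) -> str:
    """Score an incoming application against simple, illustrative rules."""
    if app.credit_score < 580:
        return "decline"
    if app.requested_amount > 0.5 * app.annual_income:
        return "refer to loan officer"
    return "approve"

# Each application is evaluated as it arrives, so a recommendation can be
# returned to the applicant in near real time.
print(recommend(LoanApplication(annual_income=60_000,
                                requested_amount=20_000,
                                credit_score=640)))
```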
Variety—Variety refers to the mix of data sources and formats. As mentioned earlier, Big Data input may arrive as text from social networks or as an image from a camera sensor. Even when the data source is structured, formats can differ: different browsers generate different data, different users may withhold information, and different vendors may send different information depending on the software they use. And whenever humans are involved, there may be errors, redundancy, and inconsistency. Management science models require uniform input data. As such, implementing these models in the era of Big Data normally requires an additional layer between the source data and the prescriptive model, as sketched below.
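The following minimal sketch shows what such a transformation layer might do: it maps records from two hypothetical sources, with different field names and units, onto the single uniform schema a prescriptive model expects. All record layouts here are assumptions for illustration.

```python
from typing import Any, Dict

# Hypothetical records from two sources with different field names and units:
# a web order form (dollars as text) and a vendor feed (integer cents).
web_record = {"cust_name": "A. Smith", "amount_usd": "125.50", "channel": "web"}
vendor_record = {"customer": "B. Jones", "amount_cents": 9900}

def to_uniform(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map a source-specific record onto the uniform schema the
    prescriptive model expects: name, amount in dollars, channel."""
    if "amount_usd" in record:
        return {"name": record["cust_name"],
                "amount": float(record["amount_usd"]),
                "channel": record.get("channel", "unknown")}
    return {"name": record["customer"],
            "amount": record["amount_cents"] / 100.0,
            "channel": record.get("channel", "unknown")}

# The transformation layer feeds the model uniform records regardless of source.
print([to_uniform(r) for r in (web_record, vendor_record)])
```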