The Big Data Ecosystem
In this book, we look at the convergence of five aspects of Big Data, which may at first glance seem to be distinct but in fact are all part of a coalescence of powerful organizations, new technologies, and consumer trends. What are they?
First and most prominent is the familiar consumer technology: the Internet, e-commerce, telematics, social media, and mobile technologies that combine to create a consumer-driven Big Data industry. This is all about entertainment and smartphones and instant messaging. We live it and see it around us every day: the quarrels and the mergers and the initial public offerings (IPOs), the money surrounding these tools and toys, and the billions being offered for the latest app. The consumer-driven Big Data industry is helpful, and it’s entertaining, even distracting, and because of that, we have a tendency to trivialize it somewhat, to see it as being not as serious or as economically important as the normal, productive economy that is essential to employment and economic growth. But the consumer-driven Big Data industry is not trivial, even if it involves Tweets, photo sharing, and Angry Birds. This is big business, and big money, concentrating our best and brightest minds on advertising, apps, and games, and on ever-more-clever ways of capturing enormous amounts of personal data.
The actual technologies are impressive, if not technologically revolutionary. But what is more likely to be revolutionary is that increasingly the companies that dominate in this consumer-driven Big Data economy (Google, Amazon, Facebook, Alibaba, Twitter, Apple, and the myriad of Internet-related startups that support and feed off these companies around the world) will also dominate, or at least greatly influence, the industrial side of the economy in the next decades. For the foreseeable future, economic growth, not only in the developed economies but in the developing world (if that phrase is even appropriate anymore), will be determined by where these Big Data giants take us. Some may worry that to have so much of our global economic future tied to a handful of gatekeeping technology companies is at least unsettling if not downright scary.
But the consumer side of Big Data that we see every day is only one aspect of the phenomenon. At the same time that the consumer-driven Big Data industry is roaring away, another, possibly even more important, side of Big Data is emerging. It is the industrial side of Big Data: Big Data applied to what is increasingly seen as the “old” economy. This is because the combination of mechatronics and Internet-based technologies is transforming the collection and analysis of business data in a much more traditional and orthodox, but nonetheless important, way. New self-reporting sensors, components, and systems now can feed performance data into ever-more sophisticated enterprise computing systems, making traditional business functions such as sales, accounting, inventory, and logistics much more efficient (and able to operate with many fewer employees). These innovative machine-based data collection and analysis technologies lie behind the expansion of the Industrial Internet and the Internet of Things, and together are causing a parallel (and occasionally overlapping) deluge of digital data generation and collection.
And although the hardware may still be manufactured by the “old economy” powerhouses like GE or Siemens or Ericsson, the companies that are likely to make the Internet of Things happen, and to control it and profit from it when it does happen, will be the new and powerful young Turks like Google, Facebook, and Amazon. That’s because their core competency is mining and analyzing Big Data. Those who once worried IBM would be their Big Brother should think again. When the battle for the Internet is over, chances are that IBM and GE will simply provide the supportive infrastructure to help Amazon, Google, and Facebook control the digital data flowing to and from our businesses, homes, cars, and smartphones.
Again, in themselves these industrial Internet Big Data technologies are not revolutionary. My Saab has been alerting me to a multitude of ongoing mechanical and electrical failures for six years. Predictive diagnostics are helpful, but they still don’t repair the car. But Google may soon be driving the car for me. And Google will suggest where I should go to have it repaired and how to get there. And the bill will be paid with the Eaze app, activated by my voice command or a nod while wearing Google Glass, after I’ve scanned the QR code through its camera and activated my virtual Apple Pay or Bitcoin wallet to transfer the payment. And while I’m away, Google will adjust my home’s thermostat using Nest technologies while Amazon orders the parts and organizes the repairs, along with my groceries and my dry cleaning. And I will monitor and direct it all from my mobile, running on Google’s Android operating system, which will allow Google to monitor and capture all that activity (including sentiment-scraping the e-mails I send to the service center), and add it to the ever-growing digital profile that it has on me, so that I can receive customized advertising (possibly a coupon from a rival auto service center). Google will then record how I react to that coupon, or if I recommend the service center to social media friends through a “like” button, and then Google will follow that cookie onto the accounts of my friends and similar coupons will appear on their Facebook sites inviting them to come to the service center, and the digital data collection will continue and grow and grow.
This apocryphal story about an aging Saab makes a more serious point. When combined, the convergence of these three Big Data trends—the Consumer Internet, the Industrial Internet, and the Internet of Things—begins to take on new significance.
That is in part because in parallel to these major Big Data trends, another powerful industry has emerged—the digital data collection industry. It consists of the big Internet players like Google, Yahoo!, Facebook, and Twitter, and online retailers like Amazon and Apple. It also includes the majority of major online and offline retail (former and current bricks-and-mortar) stores such as (in the United States) Walmart, Target, and Walgreens, which collect and sell their customers’ personal and transaction data. It also consists of hundreds of online data tracking software and service companies that most people have never heard about but that monitor our everyday online activity, following our digital footprints and selling the data (either in an aggregated or personally identifiable form) to advertisers and employment agencies and debt collectors, and anyone else who will pay for it. And, of course, it also includes the major advertising agencies, and the large data aggregators like Experian, FICO, and Acxiom who had their origins in credit reporting but now maintain colossal databases on the personal and private details of millions of people around the world.
Together, this sometimes competing, sometimes mutually supporting, collective of data-handlers has become a powerful, shadowy, economic force, making a fortune by interpreting and selling consumer-related data and allowing companies to know much, much more about those consumers than they ever thought possible. And the data collectors derive their power from the fact that they have the databases and the tools to control the data everyone wants to get their hands on—to determine how it is distributed, who sees it, and how it is used. They are the drivers, the manipulators, the monetizers of Big Data. They are important—in fact, essential—to the success of a Big Data economy, because they are the ones that spin raw data into the supposed gold of customer-targeted advertising.
The confluence of these four trends has created a frenzy of data production and collection activity, and a surfeit of digital data. In fact, Big Data can mean a Big Data overload for most companies. Enticed by the idea of creating a new profit stream by selling customer-related data on this digital data market, they are told to collect everything, truly everything, that they can on their customers and their transactions, and to store all the data in case they might later be able to extract information that could be useful to advertisers or product sales, or to simply sell that customer data to another company. That has become the mantra in companies (and, for that matter, national intelligence agencies) today: collect everything. That means e-mails, credit card numbers, online purchases, what a customer viewed and rejected on web sites, advertisements clicked on or lingered over, customer complaint calls, Tweets that refer to the company or a product, and on and on. As a result of this Big Data market, expectations about the value, ownership, and sanctity of personal data are changing radically.
This takes us on to a fifth feature of the Big Data phenomenon: the supporting Big Data technologies that are emerging. To yield any value, all that digital data needs to be stored, organized, retrieved, and analyzed, and current systems don’t handle large, unstructured data sets very well. As those of us who work in IT know, most companies’ systems are already stretched to the breaking point under the constraints of conventional database technologies.
That means to benefit from all this digital data, all the parties to this Big Data phenomenon need to ensure the success of new Big Data technologies, which include the following:
- New data search and retrieval technologies that allow mining of large and disparate data sets. These come in the form of NoSQL, Hadoop, and MapReduce-type technologies, the same tools that power the Big Data search capabilities for Google, Yahoo!, Facebook, and Amazon.
- Readily accessible data storage technologies, a need that is, in part, prompting the migration toward cloud-based outsourcing and the massive global data storage capacity being built by the Internet powers like Amazon and Google.
- Improved analytical tools that help to sift through the huge quantities and varieties of data to discover important relationships and correlations.
The development of these new technologies is important, because they have allowed the IT industry and venture capital markets to turn their attention away from the continuous, if uneventful, growth in computing power (better number-crunching capacity with ever-more-powerful computing systems) and to focus on Big Data, which is something very different.