Today’s Big Data Explosion
Next let’s look at the drivers and scope of Big Data.
Managing and Capitalizing on the Current Data Boom
We are living in the midst of a data explosion, a true boom in databases and database technology, the likes of which the world has never seen. Since you are reading this book, I assume you share my interest in database performance and scalability. Figure 1.1 illustrates the Big Data explosion driven by the current data boom, and how critical it is for us to be able to extract meaning from all of this data.
Figure 1.1 The Big Data explosion
What do I mean by a data boom? Information is often the most valuable commodity in today’s tech-centric world, which means that we as data professionals hold the keys to the kingdom. Not only is data growing more massive every year, but the rate at which it is generated is itself accelerating. It’s tough enough to figure out how to capture and manage all that data, but more importantly we need to identify the right set of architectural decisions, tools, and capabilities to allow our organizations to capitalize on it. In other words, giving meaning to raw data is just as important as the collection, reliability, and management of Big Data environments.
The causes of the explosion of database data are not hard to find:
- The advent of the Internet and the World Wide Web has generated exponential growth in the global user community—users with ever-expanding access to computing power and bandwidth.
- The interaction of these users with Internet applications has resulted in unprecedented levels of data and transaction volumes.
- The shift to online advertising supported by the likes of Google, Yahoo, and others is a key driver in the data boom we are seeing today.
- The overall expansion of the worldwide economy has spurred massive data growth for traditional commerce (e.g., increased airline travel, international purchases, online products, etc.).
- The core social networks (e.g., Facebook, Twitter, LinkedIn, and now Google+), by their very nature, have generated massive new ways for people to communicate and interact, resulting in correspondingly large data sets and transaction volume.
- Many specialized social networks have also arisen—everything from match-making sites to special interest groups, and even “buy-sell” applications that have generated their own micro-economies.
- An entirely new breed of social network applications has been spawned, leveraging the inter-connection of social network users in fascinating ways, driving exponential growth in application volume, again with huge transaction volumes and data sizes (sometimes virtually overnight success stories).
- Web- and advertising-analytics applications abound, crawling and analyzing virtually every aspect of the user interaction described above, again resulting in massive data sets with intense database access needs.
- An entirely new breed of chatter trend analytics applications has emerged, analyzing things like Twitter tweets, Facebook chats, and so on, requiring massive levels of data storage and access.
- Last, the world has gone mobile. In burgeoning economies and established countries alike, smart phones and tablets are by far the most readily available, high-growth, and commonly used communication vehicles for much of the world’s population, generating a nearly incomprehensible stream of data, transactions, application interaction, and messaging volume (with no end in sight).
Any one of the factors above would, in itself, have created unprecedented growth in databases, but taken collectively the impact is truly awe-inspiring and overwhelming. I never cease to be amazed by the number of extremely bright people I meet who have an ever-widening set of ideas, ideas that are building new businesses and further fueling the growth of the databases needed to support these fast-growing industries and sectors.
Your Role as a Data Architect
For those of us who are data architects, this is the most exciting time ever. Make no mistake about it: You are the key to the success of this new environment. Without the talent and dedication of the individual data architect in finding solutions to today’s data explosion, none of these ventures would succeed.
If you are a database architect or database administrator (DBA)—one who is driven to deliver the fastest and most scalable database platform possible for your organization—this book is addressed to you. In the end, it is your ability and intelligence that will solve these problems for real; all the technology in the world can at best assist you to do the job.
The Acceleration of Big Data Innovation
The Big Data explosion has not only yielded huge growth in data but has also spurred ingenious innovation to conquer the Big Data challenge.
The sheer number of new database platforms and DBMS engines introduced in the past few years is mind-boggling. Daunting as it may seem to stay current with this rate of technological advancement, the innovation itself is nothing but good for the professional data architect. This investment in Big Data solutions has yielded an incredibly wide array of technology options for you to consider.
While the sheer rate of innovation has created an environment that is often nothing short of exhilarating (at least for a database fanatic like me), it can at times be intimidating as well.
Therefore, knowing how to evaluate these technologies so you can pick the right options for your application is critical. In fact, it’s a vital skill you (and the companies or organizations you support) cannot afford to be without. A portion of this series is devoted to a review of the various types of data platforms available, explaining typical use cases, available solutions, and how and where to apply them. But more importantly, the underlying concepts—applicable to all DBMS engines—are covered, enabling you to make informed evaluations and decisions for your specific application requirements.
I firmly believe (and have found through hard-won experience) that the more you understand the fundamentals of Big Data platforms and technologies, the better you will be at implementing truly successful database environments, taming whatever Big Data explosion you are in the midst of.