- Introduction
- Let Me Get This Straight, Apache Derby Is IBM Cloudscape?
- Development of the Apache Derby Database—Who Can Contribute and How?
- How Can IBM Sell a Product for Profit and Contribute the Same Product to the Open Source Community?
- How an Open Source Database Like Apache Derby Can Help
- Why the Need for a Local Data Store?
- Why Use a Relational Database?
- How the Apache Derby Platform Can Help Your Business
- A High-Level View of the Apache Derby Database
- The Apache Derby Components
- Developing Apache Derby Applications
Why the Need for a Local Data Store?
There are many reasons why more and more application developers and architects are turning to embeddable or lightweight databases. Before answering the question "Why use a relational engine underneath an application?", which is what draws you to the Apache Derby platform, let’s first discuss some of the benefits of simply having a localized engine to store and manage your data.
Applications sometimes need to persist data yet operate as occasionally connected clients (OCCs). Such applications require only intermittent access to the corporate network to refresh, upload, or pull data changes. For example, a traveling insurance agent might fill out reports throughout the day and want to manage the data locally until it can be uploaded to the corporate server. During the day, however, that user will need fast and efficient local access to data that pertains to his or her region.
Local data stores have become more popular recently for security reasons as well. Having a database available to an application, but not over a network, minimizes many security concerns. Today’s economy is governed by more and more regulatory compliance rules concerning data persistence, such as the Health Insurance Portability and Accountability Act (HIPAA), Sarbanes-Oxley (SOX), the Basel II accord, and others. Another example of this is the challenge that many lines of business are facing with the new corporate reporting requirements in the United States. A lot of reporting is handled and stored in Microsoft Excel files, which might trigger compliance issues. Apache Derby can be used to provide a secure and robust repository for this data (alleviating the issues surrounding file repositories), while providing transparent access for end users who won’t really know that the data isn’t in a file.
In the past, data storage engines were under the complete control of IT departments, which had a number of implications. The engines had to be resourced through IT (which meant jumping through hoops to get a DBA’s attention or time), and they consumed corporate CPU cycles, storage, and more. A self-managing and self-contained database such as Apache Derby amounts to little more than a "rounding error" in IT budgets and skill allocations, so you can avoid that lengthy process altogether.
Geographic isolation might also drive the requirement for a local data store. For example, many of today’s cars have built-in computers that gather information that is subsequently downloaded to a certified mechanic’s computer. The data stores behind these computers are being enhanced in many ways. Computers that collect data on driving characteristics will run algorithms and queries that adjust the performance characteristics of the car (suspension, torque, air/fuel ratios, and so on) to match the owner’s driving patterns in real time.
Another example is Rolls-Royce. Although Rolls-Royce may be well known for its spectacular cars and artisanship (though the reality is that Rolls-Royce hasn’t been making Rolls-Royces since the early 1980s), its real business is producing aero-engines, along with marine engines, power generators, nuclear submarine power plants, and so on. Rolls-Royce attaches drives to some of these engines that literally record gigabytes of information during operation. This data has to be stored somewhere. Sounds like a great opportunity for a database! Take, for example, an airplane whose engines record data throughout an entire flight; that data is then handed over for analysis to the safety engineer at the destination airport. It makes us feel safe to know that whoever sends our plane on to its next destination knows how that engine performed the whole way!
Having data managed locally can increase performance, deliver better and more convenient access to vital data, and provide many other benefits that we will not cover here.
Estimates point to data collection rates of around 450 megabytes per year for every man, woman, and child, and in the three years prior to 2004, the world collected more data than it had in the preceding ten thousand years. A multitude of factors are driving unprecedented data storage requirements, and these data requirements can only be handled by a relational database management system (RDBMS).