NoSQL Key-Value Database Simplicity vs. Document Database Flexibility
NoSQL databases are known for their flexibility and performance, but no single type of NoSQL database is right for all use cases. This article discusses two types of NoSQL databases, key-value databases and document databases, and describes how to decide between the two when choosing a database.
Simplicity: Key-Value Databases
Key-value databases and document databases are quite similar. Key-value databases are the simplest of the NoSQL databases: the basic data structure is a dictionary or map. You can store a value, such as an integer, a string, a JSON structure, or an array, along with a key used to reference that value. For example, a simple key-value database might store the value "Douglas Adams" under the key cust1237.
Storing a JSON structure as the value adds complexity, but it lets a single key reference richer data. For example, the database could store a full mailing address in addition to a person's name. In the previous example, key cust1237 could point to the following information:
{"name": "Douglas Adams", "street": "782 Southwest St.", "city": "Austin", "state": "TX"}
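As a minimal sketch of this model, a Python dict can stand in for a key-value store. The put and get helpers are hypothetical names chosen for illustration, not the API of any particular product, and the value is serialized to a JSON string on the assumption that the store treats it as an opaque blob.

```python
import json

# A plain dict stands in for the key-value store (illustrative only).
store = {}

def put(key, value):
    """Serialize the value to JSON and store it under the key."""
    store[key] = json.dumps(value)

def get(key):
    """Return the deserialized value, or None if the key is absent."""
    raw = store.get(key)
    return None if raw is None else json.loads(raw)

# Store the mailing address from the example under the customer key.
put("cust1237", {
    "name": "Douglas Adams",
    "street": "782 Southwest St.",
    "city": "Austin",
    "state": "TX",
})

customer = get("cust1237")
print(customer["city"])
```

The only way to reach the address here is through the key cust1237, which is why key design matters so much in this model.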
Key management is vital to smooth operations in a key-value database. Because key-value databases have no SQL-style query language to describe which values to fetch, keys are used to reference data. Some key-value databases compensate for the lack of a query language by incorporating search capabilities. Instead of searching by key, users can search values for particular patterns; for example, all values with a certain string in the name, such as "Douglas". However, freeform search is not available in all key-value databases.
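The difference between a key lookup and a value search can be sketched as follows. The search_values function and the extra sample customers are assumptions made for illustration. The full scan shown here only conveys the idea; a product that offers search may index values instead of scanning them.

```python
# Hypothetical data: each key maps to a simple string value.
store = {
    "cust1237": "Douglas Adams",
    "cust1238": "Douglas Hofstadter",
    "cust1239": "Ada Lovelace",
}

def search_values(store, pattern):
    """Return the keys whose value contains the pattern (a full scan)."""
    return [key for key, value in store.items() if pattern in value]

# Key lookup: direct access, but the key must be known in advance.
by_key = store["cust1237"]

# Value search: no key needed, but every value is inspected.
matches = search_values(store, "Douglas")

print(by_key)
print(sorted(matches))
```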
Redis, Riak, and Oracle NoSQL Database are examples of key-value databases.
Flexibility: Document Databases
Document databases such as MongoDB and Couchbase extend the concept of the key-value database. In fact, document databases maintain sets of key-value pairs within a document. The JSON example shown earlier is also a document. The term document in NoSQL databases refers to a set of key-value pairs, typically represented in JSON, XML, or a binary form of JSON.
Feature Comparison
If a document database is essentially a key-value database with more features, shouldn't you choose the option with more features and be done with it? Not necessarily.
Document databases organize documents into groups called collections, which are analogous to the tables in relational databases. By contrast, key-value databases store all key-value pairs together in a single namespace, which is analogous to a relational schema.
As a result, key-value pairs of one type, such as IDs and names, are stored alongside pairs of other types, such as IDs and customer orders. Whether this is a problem depends on the search capabilities of the key-value database. A freeform text search for the ID associated with a given name runs across all value types, not just names. Unless search queries are carefully crafted, this approach can be less efficient than using a document database.
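The sketch below illustrates the single-namespace issue under stated assumptions: the sample data, both function names, and the convention of encoding the entity type as a key prefix ("cust", "order", "prod") are hypothetical and not taken from any specific database.

```python
# One namespace holding customers, orders, and products side by side.
namespace = {
    "cust1237": "Douglas Adams",
    "order5521": "The Hitchhiker's Guide, author Douglas Adams, qty 2",
    "prod88": "Towel, department Travel",
}

def freeform_search(namespace, text):
    """Search every value in the namespace, whatever its type."""
    return sorted(key for key, value in namespace.items() if text in value)

def search_with_prefix(namespace, text, prefix):
    """Restrict the search to keys that follow an assumed prefix convention."""
    return sorted(
        key for key, value in namespace.items()
        if key.startswith(prefix) and text in value
    )

# An uncrafted search also matches the order that mentions the name.
all_hits = freeform_search(namespace, "Douglas")

# A crafted search narrows the result to customer entries only.
customer_hits = search_with_prefix(namespace, "Douglas", "cust")

print(all_hits)
print(customer_hits)
```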
Document database collections allow developers to apply a high level of organization to their databases. For example, a document database for an e-commerce site might include collections for customers, orders, and products:
- Customers would include fields such as name, shipping address, and billing address.
- Orders would include fields such as product and shipping address data.
- Products would include fields such as department and price data.
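The three collections listed above can be sketched with nested Python dicts. This is an illustrative model only: the find helper, the field names, and the sample documents are assumptions, and real document databases expose their own query interfaces.

```python
# Each collection groups documents of one entity type.
database = {
    "customers": {
        "cust1237": {
            "name": "Douglas Adams",
            "shipping_address": "782 Southwest St., Austin, TX",
            "billing_address": "782 Southwest St., Austin, TX",
        },
    },
    "orders": {
        "order5521": {
            "product": "prod88",
            "shipping_address": "782 Southwest St., Austin, TX",
        },
    },
    "products": {
        "prod88": {"name": "Towel", "department": "Travel", "price": 12.5},
        "prod89": {"name": "Teapot", "department": "Kitchen", "price": 30.0},
    },
}

def find(collection, **criteria):
    """Return documents in one collection whose fields equal all criteria."""
    return [
        doc for doc in database[collection].values()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

# The query touches only the products collection, never customers or orders.
travel_products = find("products", department="Travel")
print(travel_products)
```

Because a query names its collection, it never has to inspect documents of an unrelated type.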
Document databases can also provide better performance when working with complex data sets. Separating collection entities by type, such as orders and customer profiles, can help with performance. Similarly, large collections, such as a collection of products, can be partitioned to improve query performance. Partitioning splits collections over multiple servers, allocating a subset of work to each server.
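Partitioning can be sketched as routing each document to a server based on its key. The example assumes hash-based partitioning over three in-memory dicts that stand in for servers. All names are hypothetical, and actual products use their own partitioning schemes.

```python
import hashlib

NUM_PARTITIONS = 3

# Each dict stands in for the part of the collection held by one server.
partitions = [dict() for _ in range(NUM_PARTITIONS)]

def partition_for(key):
    """Map a key to a partition number using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

def insert(key, document):
    """Store the document on the partition that owns its key."""
    partitions[partition_for(key)][key] = document

def lookup(key):
    """Fetch a document by asking only the partition that owns its key."""
    return partitions[partition_for(key)].get(key)

# Spread a products collection across the partitions.
for i in range(1000):
    insert(f"prod{i}", {"department": "Travel", "price": i})

print([len(p) for p in partitions])
print(lookup("prod42"))
```

A lookup by key consults a single partition, so each server handles only a subset of the work.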
Document databases also support indexing, which can speed up queries that use filter criteria, such as a search for all orders placed in the last 10 days. However, it's often best to limit the use of indexes, confining them to fields that are commonly used in filtering query results. Having too many indexes can slow write operations. In the end, using indexes is a tradeoff: you may get faster query responses at the expense of slower write operations, plus the cost of additional space to store index data.
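The tradeoff can be made concrete with a small sketch. It assumes a sorted in-memory list as the index on the order date. The function names, sample orders, and the fixed reference date are all hypothetical, and real document databases typically use more sophisticated index structures.

```python
import bisect
from datetime import date, timedelta

orders = {}        # order_id -> document
placed_index = []  # sorted list of (placed_on, order_id) pairs

def insert_order(order_id, placed_on, total):
    """Write the document, then pay the extra cost of updating the index."""
    orders[order_id] = {"placed_on": placed_on, "total": total}
    bisect.insort(placed_index, (placed_on, order_id))

def orders_since(cutoff):
    """Use the index to jump to the first order placed on or after cutoff."""
    start = bisect.bisect_left(placed_index, (cutoff, ""))
    return [order_id for _, order_id in placed_index[start:]]

# A fixed reference date keeps the example reproducible.
today = date(2024, 1, 31)
insert_order("order1", today - timedelta(days=30), 20.0)
insert_order("order2", today - timedelta(days=5), 35.5)
insert_order("order3", today - timedelta(days=1), 12.0)

# Orders placed in the last 10 days, found without scanning every document.
recent = orders_since(today - timedelta(days=10))
print(recent)
```

Every write now updates two structures, and the index list takes additional space, which mirrors the costs described above.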
Making Your Choice
The choice between key-value and document databases comes down to your data and application needs. If you usually retrieve data by key or ID value and don't need to support complex queries, a key-value database is a good option. If you need search capabilities beyond key lookup, a key-value database that supports searching may be sufficient. If you have different types of entities and need complex querying, choose a document database.
Both key-value and document databases are sound choices for many database applications. If query patterns and data structures are fairly simple, key-value databases are a good choice. As the complexity of queries and entities increases, document databases become a better option.
Dan Sullivan, author of NoSQL for Mere Mortals, is an enterprise architect and consultant with over 20 years of IT experience, with engagements in advanced analytics, systems architecture, database design, enterprise security, and business intelligence.
James Sullivan is a business technology writer with concentrations in mobile, security, and database services. He is based in Portland, Oregon.