Developing enterprise software requires a rich mix of programming and business experience. Application logic must accurately reflect the business processes in its domain and use data access and system resources efficiently.
Take an employee payroll system as an example. Consider a simple batch process that issues reimbursement checks for employee expenses. This process requires the following database operations:
- Get a list of all employees for whom expense reimbursements are due.
- For each employee in this list, get a list of active expenses reported.
- Issue the employee a check for the total.
- Reset the employee's expense status.
- Delete the employee's active expense records.
The application logic for this process is straightforward and does not stray far from the steps listed here. The code, however, can easily become the opposite. Each of these steps requires multiple physical database operations and management of the corresponding resources. If you mix this database code into the application logic, the result quickly becomes convoluted.
The following code block illustrates this phenomenon. It implements the employee expense reimbursement process using Java, JDBC, and SQL. Notice the mix of database, technology, and domain details.
Connection connection = DriverManager.getConnection(...);

// Get a list of employees that need to be
// reimbursed for expenses.
PreparedStatement employeesStatement = connection.prepareStatement(
    "SELECT EMPLOYEE_ID FROM P_EMPLOYEES "
    + "WHERE EXPENSE_FLAG = ?");
employeesStatement.setString(1, "Reimburse");
ResultSet employeesResultSet = employeesStatement.executeQuery();

while(employeesResultSet.next()) {
    int employeeID = employeesResultSet.getInt(1);

    // Get a list of expense records for the employee.
    PreparedStatement expensesStatement = connection.prepareStatement(
        "SELECT AMOUNT FROM A_EXPENSES "
        + "WHERE EMPLOYEE_ID = ?");
    expensesStatement.setInt(1, employeeID);
    ResultSet expensesResultSet = expensesStatement.executeQuery();

    // Total the expense records.
    long totalExpense = 0;
    while(expensesResultSet.next()) {
        long amount = expensesResultSet.getLong(1);
        totalExpense += amount;
    }

    // Issue the employee a check for the sum.
    issueEmployeeCheck(employeeID, totalExpense);

    // Update the employee's expense status to none.
    PreparedStatement updateExpenseStatus = connection.prepareStatement(
        "UPDATE P_EMPLOYEES SET EXPENSE_FLAG = ? "
        + "WHERE EMPLOYEE_ID = ?");
    updateExpenseStatus.setString(1, "None");
    updateExpenseStatus.setInt(2, employeeID);
    updateExpenseStatus.executeUpdate();
    updateExpenseStatus.close();

    // Delete all of the employee's expense records.
    PreparedStatement deleteExpenseRecords = connection.prepareStatement(
        "DELETE FROM A_EXPENSES WHERE EMPLOYEE_ID = ?");
    deleteExpenseRecords.setInt(1, employeeID);
    deleteExpenseRecords.executeUpdate();
    deleteExpenseRecords.close();

    expensesStatement.close();
    expensesResultSet.close();
}

employeesResultSet.close();
employeesStatement.close();
Now, scale this implementation style to an entire suite of applications. Database access code sprinkled throughout application logic makes that logic especially hard to maintain. One reason is that developers who support and enhance this code must be intimately familiar with both the application logic and the data access details. Bigger problems arise when you need to support additional database platforms or incorporate optimizations such as a connection pool. With data access code spread throughout an entire product, these enhancements become major engineering projects that touch most of the product's source files.
The Data Accessor pattern addresses this problem. Its primary objective is to build an abstraction that hides low-level data access details from the rest of the application code. This abstraction exposes only high-level, logical operations. With a robust abstraction in place, application code focuses on operations from the domain point of view. This focus results in clean, maintainable application logic. Figure 1.1 illustrates how the data accessor abstraction and implementation decouple the application logic from the physical database driver:
Figure 1.1. The Data Accessor pattern decouples application logic from the physical data access implementation by defining an abstraction that exposes only logical operations to the application code.
The data accessor implementation handles all the physical data access details on behalf of the application code. This isolation makes it possible to fix database access defects and incorporate new features in a single component and have those changes take effect across the entire system.
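To make this isolation concrete, here is a minimal sketch of how the expense reimbursement process might read when written against a data accessor. Every name in it (`ExpenseDemo`, `DataAccessor`, `InMemoryDataAccessor`, `reimburse`) is hypothetical and is not taken from this chapter's sample code. An in-memory implementation stands in for a JDBC-backed one so that the example runs without a database, and the returned map stands in for the calls to `issueEmployeeCheck`. It assumes Java 9 or later for `Map.of`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; all type and method names here are hypothetical.
class ExpenseDemo {

    // The abstraction: logical operations only, no SQL or JDBC types.
    interface DataAccessor {
        List<Map<String, Object>> read(String table, String column, Object value);
        void update(String table, String setColumn, Object newValue,
                    String whereColumn, Object whereValue);
        void delete(String table, String column, Object value);
    }

    // In-memory stand-in. A JDBC-backed implementation would manage connections,
    // prepared statements, and result sets behind the same interface.
    static class InMemoryDataAccessor implements DataAccessor {
        private final Map<String, List<Map<String, Object>>> tables = new HashMap<>();

        private List<Map<String, Object>> rows(String table) {
            return tables.computeIfAbsent(table, t -> new ArrayList<>());
        }

        void insert(String table, Map<String, Object> row) {
            rows(table).add(new HashMap<>(row));
        }

        @Override
        public List<Map<String, Object>> read(String table, String column, Object value) {
            List<Map<String, Object>> matches = new ArrayList<>();
            for (Map<String, Object> row : rows(table)) {
                if (value.equals(row.get(column))) {
                    matches.add(new HashMap<>(row)); // copy so callers cannot alter stored rows
                }
            }
            return matches;
        }

        @Override
        public void update(String table, String setColumn, Object newValue,
                           String whereColumn, Object whereValue) {
            for (Map<String, Object> row : rows(table)) {
                if (whereValue.equals(row.get(whereColumn))) {
                    row.put(setColumn, newValue);
                }
            }
        }

        @Override
        public void delete(String table, String column, Object value) {
            rows(table).removeIf(row -> value.equals(row.get(column)));
        }
    }

    // Application logic: it follows the five steps of the process and
    // contains no statements, result sets, or connection handling.
    static Map<Integer, Long> reimburse(DataAccessor accessor) {
        Map<Integer, Long> checks = new HashMap<>();
        for (Map<String, Object> employee
                : accessor.read("P_EMPLOYEES", "EXPENSE_FLAG", "Reimburse")) {
            Integer employeeId = (Integer) employee.get("EMPLOYEE_ID");
            long total = 0;
            for (Map<String, Object> expense
                    : accessor.read("A_EXPENSES", "EMPLOYEE_ID", employeeId)) {
                total += (Long) expense.get("AMOUNT");
            }
            checks.put(employeeId, total); // stands in for issueEmployeeCheck
            accessor.update("P_EMPLOYEES", "EXPENSE_FLAG", "None", "EMPLOYEE_ID", employeeId);
            accessor.delete("A_EXPENSES", "EMPLOYEE_ID", employeeId);
        }
        return checks;
    }

    public static void main(String[] args) {
        InMemoryDataAccessor accessor = new InMemoryDataAccessor();
        accessor.insert("P_EMPLOYEES", Map.of("EMPLOYEE_ID", 7, "EXPENSE_FLAG", "Reimburse"));
        accessor.insert("A_EXPENSES", Map.of("EMPLOYEE_ID", 7, "AMOUNT", 125L));
        accessor.insert("A_EXPENSES", Map.of("EMPLOYEE_ID", 7, "AMOUNT", 75L));
        System.out.println(reimburse(accessor)); // prints {7=200}
    }
}
```

Because `reimburse` depends only on the interface, a different implementation could be substituted (for example, one that uses a connection pool) without touching this method.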
The logical operations that you expose depend on your application's data access requirements. In the employee expense process described earlier, it might be helpful to define logical read and write operations in terms of table and column names without requiring the application code to issue SQL statements or directly manage prepared statements or result sets. The “Sample Code” section in this chapter contains an example of some simple logical database operations.
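As a rough illustration of operations defined in terms of table and column names, the following sketch shows how a data accessor implementation might turn such requests into parameterized SQL text. `SqlBuilder` and its methods are hypothetical names chosen for this example, not part of the pattern's sample code. The sketch concatenates table and column names directly, so it assumes those names come from trusted application code. Values are represented only by `?` placeholders, to be bound later through a prepared statement.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch; SqlBuilder and its methods are hypothetical names.
class SqlBuilder {

    // Renders " WHERE A = ? AND B = ?", or an empty string when nothing is selected on.
    private static String whereClause(List<String> selectionColumns) {
        if (selectionColumns.isEmpty()) {
            return "";
        }
        return selectionColumns.stream()
                .map(column -> column + " = ?")
                .collect(Collectors.joining(" AND ", " WHERE ", ""));
    }

    static String select(String table, List<String> columns, List<String> selectionColumns) {
        return "SELECT " + String.join(", ", columns)
                + " FROM " + table + whereClause(selectionColumns);
    }

    static String update(String table, List<String> updateColumns,
                         List<String> selectionColumns) {
        String assignments = updateColumns.stream()
                .map(column -> column + " = ?")
                .collect(Collectors.joining(", "));
        return "UPDATE " + table + " SET " + assignments + whereClause(selectionColumns);
    }

    static String delete(String table, List<String> selectionColumns) {
        return "DELETE FROM " + table + whereClause(selectionColumns);
    }

    public static void main(String[] args) {
        // prints SELECT EMPLOYEE_ID FROM P_EMPLOYEES WHERE EXPENSE_FLAG = ?
        System.out.println(select("P_EMPLOYEES", List.of("EMPLOYEE_ID"), List.of("EXPENSE_FLAG")));
        // prints UPDATE P_EMPLOYEES SET EXPENSE_FLAG = ? WHERE EMPLOYEE_ID = ?
        System.out.println(update("P_EMPLOYEES", List.of("EXPENSE_FLAG"), List.of("EMPLOYEE_ID")));
        // prints DELETE FROM A_EXPENSES WHERE EMPLOYEE_ID = ?
        System.out.println(delete("A_EXPENSES", List.of("EMPLOYEE_ID")));
    }
}
```

The generated strings match the statements that the earlier expense example wrote by hand, which suggests that the application code could drop its SQL literals entirely.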
You can also use a data accessor to hide a database's semantic details as well as constraints that your system's architecture imposes. Here are some ideas for encapsulating physical data access details:
- Expose logical operations; encapsulate physical operations: The data accessor abstraction can expose logical database operations such as read, insert, update, and delete, instead of requiring application code to issue SQL statements or something at a similar, lower level. The data accessor implementation generates efficient SQL statements on the application's behalf. This is beneficial because it saves application developers from learning the intricacies of SQL and also allows you to change your strategies for issuing these operations without affecting application code.
- Expose logical resources; encapsulate physical resources: The more details you hide from application code, the more you are at liberty to change. One example of this is database resource management. If you let applications manage their own database connections, it is hard to incorporate enhancements like connection pooling, statement caching, or data distribution later on.
  You may find it convenient to provide logical connection handles to applications. Applications can use these handles to associate operations with physical connection pools and physical connection mapping strategies. The data accessor implementation is responsible for resolving exact table locations and physical connections at runtime. This is especially convenient when data is distributed across multiple databases.
- Normalize and format data: The physical format of data is not necessarily the most convenient form for applications to work with, especially if the format varies across multiple database platforms. For example, databases often store and return binary large object (BLOB) data as an array or stream of raw bytes. The data accessor implementation can be responsible for deserializing these bytes and handing an object representation to the application.
- Encapsulate platform details: Business relationships change frequently. If your company initiates a new partnership that requires your application to support additional database products, encapsulating any database platform details within a data accessor implementation facilitates the enhancements. If you take this as far as to hide any particular technology, such as SQL, then you can more readily support non-SQL databases as well, all without extensive application code changes.
- Encapsulate optimization details: Application behavior should not directly depend on optimizations like pools and caches because that hinders your ability to change these optimizations in the future. If you only allow application code to allocate logical resources and issue logical operations, then you retain the freedom to implement these logical operations within the data accessor implementation with whatever optimized strategies are at your disposal.
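The logical connection handles mentioned in the list above could be resolved along the following lines. This is only a sketch under stated assumptions: `ConnectionDirectory` and its methods are hypothetical, the JDBC URLs are made-up placeholders, and the class returns a URL string where a fuller implementation would more likely hand back a pooled `javax.sql.DataSource`. The point it illustrates is that application code names a table, and the mapping to a physical database can change without that code changing.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; ConnectionDirectory and its methods are hypothetical names.
class ConnectionDirectory {
    private final Map<String, String> urlsByLogicalName = new HashMap<>();
    private final Map<String, String> logicalNamesByTable = new HashMap<>();

    // Binds (or rebinds) a logical connection name to a physical JDBC URL.
    void registerConnection(String logicalName, String jdbcUrl) {
        urlsByLogicalName.put(logicalName, jdbcUrl);
    }

    // Records which logical connection holds a given table.
    void assignTable(String table, String logicalName) {
        if (!urlsByLogicalName.containsKey(logicalName)) {
            throw new IllegalArgumentException("Unknown logical connection: " + logicalName);
        }
        logicalNamesByTable.put(table, logicalName);
    }

    // Resolves a table to its current physical location at call time.
    String resolve(String table) {
        String logicalName = logicalNamesByTable.get(table);
        if (logicalName == null) {
            throw new IllegalArgumentException("No logical connection for table: " + table);
        }
        return urlsByLogicalName.get(logicalName);
    }

    public static void main(String[] args) {
        ConnectionDirectory directory = new ConnectionDirectory();
        // Placeholder URLs, not real hosts or drivers.
        directory.registerConnection("payroll", "jdbc:example://host-a/payroll");
        directory.registerConnection("expenses", "jdbc:example://host-b/expenses");
        directory.assignTable("P_EMPLOYEES", "payroll");
        directory.assignTable("A_EXPENSES", "expenses");

        System.out.println(directory.resolve("A_EXPENSES")); // prints jdbc:example://host-b/expenses

        // Moving the expenses data only requires rebinding the logical name.
        directory.registerConnection("expenses", "jdbc:example://host-c/expenses");
        System.out.println(directory.resolve("A_EXPENSES")); // prints jdbc:example://host-c/expenses
    }
}
```

Resolving at call time rather than caching the URL in application code is what keeps data distribution an implementation detail of the data accessor.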
The Data Accessor pattern makes application code more amenable to enhancement and optimization. In addition, it defines a clear separation between application domain code and data access details. Besides easing the maintenance issues described throughout this chapter, this separation benefits engineering teams, since you can divide the development of different components among multiple programmers with diverse skills and experience.