Performing Checkpoints
The second component of the infrastructure is performing checkpoints of the log files. As transactions commit, change records are written into the log files, but the actual changes to the database are not necessarily written to disk. When a checkpoint is performed, the changes to the database that are part of committed transactions are written into the backing database file.
Performing checkpoints is necessary for two reasons. First, you can remove the Berkeley DB log files from your system only after a checkpoint. Second, the frequency of your checkpoints is inversely proportional to the amount of time it takes to run database recovery after a system or application failure.
Once the database pages are written, log files can be archived and removed from the system because they will never be needed for anything other than recovery from catastrophic failure. In addition, recovery after a system or application failure has to redo or undo only the changes made since the last checkpoint, because all changes before the checkpoint have already been flushed to the filesystem.
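The relationship between checkpoints and recovery work can be sketched with a toy model. This is not Berkeley DB code: the function name records_to_replay and the representation of log records as plain increasing sequence numbers are assumptions made purely for illustration, and real recovery involves more bookkeeping than is shown here.

```c
#include <stddef.h>

/*
 * Toy model, not Berkeley DB code: each log record is identified by an
 * increasing sequence number, and checkpoint_seq is the sequence number
 * of the last record covered by the most recent checkpoint.
 *
 * Returns how many records recovery would still have to examine.
 */
size_t
records_to_replay(const unsigned long *seqs, size_t n,
    unsigned long checkpoint_seq)
{
    size_t i, count;

    count = 0;
    /* Records at or before the checkpoint are already on disk. */
    for (i = 0; i < n; i++)
        if (seqs[i] > checkpoint_seq)
            count++;
    return (count);
}
```

In this model, a log of eight records with no checkpoint leaves all eight to be examined, while a checkpoint covering the first six leaves only two. The more recent the checkpoint, the less work recovery has to do.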
Berkeley DB provides a separate utility, db_checkpoint, which can be used to perform checkpoints. Alternatively, applications can write their own checkpoint utility using the underlying txn_checkpoint function. The following code fragment checkpoints the database environment every 60 seconds:
    /* Defined below; started as a separate thread by main. */
    void *checkpoint_thread(void *);

    int
    main(int argc, char *argv[])
    {
        extern char *optarg;
        extern int optind;
        DB *db_cats, *db_color, *db_fruit;
        DB_ENV *dbenv;
        pthread_t ptid;
        int ch;

        while ((ch = getopt(argc, argv, "")) != EOF)
            switch (ch) {
            case '?':
            default:
                usage();
            }
        argc -= optind;
        argv += optind;

        env_dir_create();
        env_open(&dbenv);

        /* Start a checkpoint thread. */
        if ((errno = pthread_create(
            &ptid, NULL, checkpoint_thread, (void *)dbenv)) != 0) {
            fprintf(stderr,
                "txnapp: failed spawning checkpoint thread: %s\n",
                strerror(errno));
            exit (1);
        }

        /* Open database: Key is fruit class; Data is specific type. */
        db_open(dbenv, &db_fruit, "fruit", 0);

        /* Open database: Key is a color; Data is an integer. */
        db_open(dbenv, &db_color, "color", 0);

        /*
         * Open database:
         *    Key is a name; Data is: company name, address, cat breeds.
         */
        db_open(dbenv, &db_cats, "cats", 1);

        add_fruit(dbenv, db_fruit, "apple", "yellow delicious");

        add_color(dbenv, db_color, "blue", 0);
        add_color(dbenv, db_color, "blue", 3);

        add_cat(dbenv, db_cats,
            "Amy Adams",
            "Sleepycat Software",
            "118 Tower Rd., Lincoln, MA 01741, USA",
            "abyssinian",
            "bengal",
            "chartreaux",
            NULL);

        return (0);
    }

    void *
    checkpoint_thread(void *arg)
    {
        DB_ENV *dbenv;
        int ret;

        dbenv = arg;
        dbenv->errx(dbenv, "Checkpoint thread: %lu", (u_long)pthread_self());

        /* Checkpoint once a minute. */
        for (;; sleep(60))
            switch (ret = txn_checkpoint(dbenv, 0, 0, 0)) {
            case 0:
            case DB_INCOMPLETE:
                /* Success, or a checkpoint that could not complete yet. */
                break;
            default:
                dbenv->err(dbenv, ret, "checkpoint thread");
                exit (1);
            }

        /* NOTREACHED */
    }
Because checkpoints can be quite expensive, choosing how often to perform a checkpoint is a common tuning parameter for Berkeley DB applications.
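One common way to tune checkpoint frequency is to checkpoint only once enough log data has been written or enough time has passed. The sketch below models such a policy in isolation. The struct ckp_policy and the function checkpoint_due are hypothetical names introduced for illustration and are not part of the Berkeley DB API. The two thresholds are loosely modeled on the second and third arguments of txn_checkpoint (kilobytes of log written and minutes elapsed), on the assumption that those arguments behave as described in the Berkeley DB interface documentation for your release.

```c
/* Hypothetical policy settings; a zero threshold disables that trigger. */
struct ckp_policy {
    unsigned long kbyte;    /* Checkpoint after this many KB of log. */
    unsigned long min;      /* Checkpoint after this many minutes. */
};

/*
 * Return 1 if a checkpoint is due under the policy, 0 otherwise.
 * With both thresholds zero, a checkpoint is always due, which mirrors
 * an unconditional checkpoint request.
 */
int
checkpoint_due(const struct ckp_policy *p,
    unsigned long kb_since_last, unsigned long secs_since_last)
{
    if (p->kbyte == 0 && p->min == 0)
        return (1);
    if (p->kbyte != 0 && kb_since_last >= p->kbyte)
        return (1);
    if (p->min != 0 && secs_since_last / 60 >= p->min)
        return (1);
    return (0);
}
```

With thresholds of 1024 KB and 5 minutes, for example, a lightly loaded application checkpoints about every five minutes, while a burst of writes triggers a checkpoint sooner. Larger thresholds reduce checkpoint overhead at the cost of longer recovery.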