ORACLE Cache Recovery Tuning
Tuning ORACLE for recovery is about configuring and managing checkpointing. The goal in recovery tuning is finding the right balance between checkpointing at a sufficient rate such that the number of redo log records that need to be applied during a recovery is bounded, and sustainable performance meets business goals. Clearly, a system could be tuned to checkpoint aggressively, but the additional I/O load (writes) imposed by such tuning could impede performance significantly (for example, in a write-intensive workload environment). Note that the number of redo records used has an application dependency, depending on the number of data blocks affected. In some cases, where there are multiple updates to the same block, keeping things in memory could have a write-cancellation effect. The primary concern for maintaining acceptable production performance is avoiding situations where a surge of log flushing induces transaction latency due to pressure on the I/O subsystem.
ORACLE Checkpoint Parameters
ORACLE 8.0.X releases provide some level of checkpoint management by allowing the administrator to force the instance to checkpoint when a predefined threshold is reached, by specifying it in log_check_interval and log_checkpoint_timeout. The log_checkpoint_interval is set in blocks and causes a checkpoint to occur when the log writer (LGWR) ORACLE process writes the number of blocks set in this parameter. The log_checkpoint_timeout is set in seconds and determines the maximum time between checkpoints. (Both of these settings are ORACLE 8.0.x specific behavior.)
Note that if both parameters are set, ORACLE maintains checkpointing to meet the more aggressive of the two parameters, which of also course satisfies the least aggressive.
In addition to checkpoint tuning, recovery time can be managed through the parameter recovery_parallelism. Beginning with release 8.0.4, the parameter parallel_execution_message_size is available. (In releases prior to 8.0.4, the message size was fixed.)
The recovery_parallelism parameter defines the number of recovery slave processes that are started during cache recovery. On multiprocessor systems, it is beneficial to set this value to at least the number of processors installed. This setting allows for concurrency in redo log processing.
The parameter parallel_execution_message_size defines the size of messages passed between recovery slaves. (This parameter affects other parallel features such as parallel query.) The default value of 2048 (2K) is typically too small. Increasing this value can significantly improve recovery time.
As mentioned previously, ORACLE 8.1 added fast start checkpointing and the fast_start_io_target parameter. Fast start, or incremental checkpoint, does not write every modified buffer to disk. Instead, some subset of the total number of modified buffers is written, based on current configuration parameters and tuning.
The fast_start_io_target parameter is defined as the maximum number of blocks that would need recovery. Because random disk I/O dominates roll forward, it is assumed that there is a physical disk I/O associated with each disk block. Thus, this parameter should correlate to the random I/O capabilities of the system being tuned. For example, if the storage subsystem is capable of 10,000 random IOPS, and the recovery goal is about 300 seconds (5 minutes), set fast_start_io_target to 300*10,000 (3,000,000). This is not precise, because recovery time has other components that must also be considered; other considerations are covered in "Best Practices" on page 19.
Beginning with ORACLE 8.1, the meaning of log_checkpoint_interval and log_checkpoint_timeout changed. The log_checkpoint_interval parameter defines the maximum number of redo log blocks that need to be read during recovery, by defining the maximum number of blocks that can exist between the current redo log position and the end of the log (log tail).
The log_checkpoint_timeout sets a boundary on the time required to read redo blocks during instance recovery, by setting the checkpoint target location in the log (the location that, when hit, initiates a checkpoint) relative to the end of the log, so that the end of the log is log_checkpoint_timeout seconds ago.
ORACLE 9 deprecates fast_start_io_target by adding fast_start_mttr_target. The idea is that you set your recovery requirement in seconds rather than disk blocks or disk IOPS. ORACLE maintains recovery timings and startup heuristics for maintaining the redo log size, so that the fast_start_mttr_target can be met (ORACLE reports an error if fast_start_mttr_target is set to a value that ORACLE determines it can not meet).
NOTE
The fast_start_mttr_target parameter specifies total instance startup time, not just the roll-forward phase. If you have a requirement of 5 minutes instance recovery, set fast_start_mttr_target to 300.
The following summarizes the ORACLE parameters related to cache recovery:
8.0 provides the log checkpoint parameters and recovery_parallelism. Hard checkpointing only (all modified buffers written).
8.0.4 adds parallel_execution_message_size.
8.1 adds fast_start_io_target and incremental checkpointing.
9.x adds fast_start_mttr_target with further enhancements to the recovery code path.
ORACLE Recovery
In addition to starting ORACLE by rolling forward, we divide startup into four distinct phases:
Start up process
Create Shared Global Area (SGA)
Mount database
Open database
The time required for each of these phases to complete varies with database configurations. You can figure the time for your system by examining the ORACLE alert_<instance_name>.log file in $ORACLE_HOME/rdbms/log. ORACLE provides date and time stamps with event messages and milestones in startup, so it's relative easy to figure out how long ORACLE takes to start up. It is recommended that you do the calculation and note these times, so you have a sense for the startup time required for the instances you are managing. The following are key lines from a typical alert log file.
CODE EXAMPLE 1 Alert Log File Sample
Tue Nov 5 18:02:30 2002 Starting ORACLE instance (normal) ... Tue Nov 5 18:07:21 2002 ALTER DATABASE MOUNT ... Tue Nov 5 18:07:26 2002 ALTER DATABASE OPEN ... Tue Nov 5 18:08:12 2002 Database Characterset is US7ASCII Completed: ALTER DATABASE OPEN
We can easily determine that a clean start of this ORACLE instance took 5 minutes and 42 seconds (from 18:02:30 to 18:08:12). Also, we can see that the majority of the time was the process and SGA initialization phase, which took 4 minutes and 51 seconds (from 18:02:30 to 18:07:21), 85% of the total start up time. We note that in this case, a relatively large SGA is configured (169 Gigabyte SGA), which is where most of the time was spent.
To understand recovery times, add the following entry to the instance init.ora file:
event="10333 trace name context forever, level 2"
With this entry, you get recovery statistics in your alert log. Internally, ORACLE tracks startup phases with microsecond granularity. This gives ORACLE the ability to come relatively close to meeting mean time to recovery (MTTR) goals based on parameter settings (fast_start_mttr_target). An actual crash recovery improves the accuracy, because ORACLE measures the time involved in reading redo log blocks and applying that data to the data blocks, then saves that information for subsequent calculations for meeting MTTR targets.
The following is sample data from an alert log after a crash.
CODE EXAMPLE 2
Wed Nov 6 07:43:08 2002 ALTER DATABASE OPEN Wed Nov 6 07:43:12 2002 Beginning crash recovery of 1 threads attempting to start a parallel recovery with 24 processes parallel recovery started with 24 processes Wed Nov 6 07:43:18 2002 Started first pass scan Wed Nov 6 07:50:35 2002 Completed first pass scan 46213731 redo blocks read, 7072518 data blocks need recovery
We set recovery_parallelism to 24, and we see the roll forward begins when ORACLE attempts to open the database. We get statistics on how much work is required to do the roll forward. In this case, we have about 46 million redo blocks and 7.1 million data blocks to process. We can time the roll forward phases. The time between "Started first pass scan" and "Completed first pass scan" is the time to read the redo log blocks, which takes about 7 minutes (07:43:18 to 07:50:35).
CODE EXAMPLE 3
Wed Nov 6 07:54:08 2002 Started recovery at Thread 1: logseq 92, block 36433268, scn 0.0
The time from "Completed first pass scan" to "Started Recovery" is the time to read and claim all data blocks that need recovery. In this case, it took just over three-and-a-half minutes (07:50:35 to 07:54:08).
The final phase of the roll forward work is applying redo log block data to data blocks. This time is between the "Started recovery" and "Completed redo application" messages. In the example, about 9 minutes is required for this last phase (7:54:08 to 8:05:17).
CODE EXAMPLE 4
Recovery of Online Redo Log: Thread 1 Group 2 Seq 92 Reading mem 0 Mem# 0 errs 0: /links/log2 Wed Nov 6 07:56:42 2002 Recovery of Online Redo Log: Thread 1 Group 1 Seq 93 Reading mem 0 Mem# 0 errs 0: /links/log1 Wed Nov 6 08:05:17 2002 Completed redo application Wed Nov 6 08:05:32 2002 Ended recovery at
The total roll forward time in this case is about 20 minutes. More precisely, using the time stamps from "Beginning crash recovery" to "Ended recovery," we have 07:43:12 to 08:05:32, which is 22 minutes and 20 seconds.
The next alert log entries provide information on the amount of work done on the roll forward and ORACLE's latest set of MTTR statistics.
CODE EXAMPLE 5
Ended recovery at Thread 1: logseq 93, block 32014541, scn 0.444646430 7072518 data blocks read, 7072518 data blocks written, 46213731 redo blocks read MTTR statistics updated from recovery MTTR statistics status: 3 Init time: Avg: 194842222, Times measured: 3 File open time: Avg: 5395, Times measured: 484 Log block read time: Avg: 9, Times measured: 46213732 Data block handling time: Avg: 126, Times measured: 7072519
We see that 7.1 megabyte data blocks were read (scanned) and written (with the redo data), and that 46.2 megabyte redo log blocks were read. This data is useful for sizing your I/O subsystem when recovery or checkpoints are taking too long. By understanding the amount of data that needs to be processed and the basic capabilities of the I/O subsystem, layout, and configuration, you can establish thresholds to improve recovery and checkpoint times.
ORACLE MTTR Statistics
The MTTR statistics are ORACLE's most recent view of the recovery. The "statistics status 3" means that statistics were updated from a recovery (as the preceding text message indicates). When a new instance is created, the MTTR statistics are initialized to a hard-coded value to provide a starting point. Also, they are updated from runtime data.
The time values reported are in microseconds and are maintained as running averages. For example, the "File open time" average is 5.4 milliseconds. If there are 500 data files in your database instance, it takes 2700 milliseconds (or just under 3 seconds) to open all files. Data block handling time averages 126 microseconds. At 7.1 megabyte data blocks, that's 894.6 million microseconds (or 895 seconds, which is about 15 minutes). The point is that there is useful data in the alert log that can help you identify issues in a system's ability to perform recovery and determine what improvements can be made.