Recovery Principles
Recovery principles are the same, regardless of whether you are in a Unix or Windows NT environment. The following are general guidelines for recovery using a cold backup, hot backup, and export.
Definitions
Control File: The control file contains records that describe and maintain information about the physical structure of a database. The control file is updated continuously during database use and must be available for writing whenever the database is open. If the control file is not accessible, the database will not open.
System Change Number (SCN): The system change number is a clock value for the database that describes a committed version of the database. The SCN functions as a sequence generator for a database and controls concurrency and redo record ordering. Think of the SCN as a timestamp that helps ensure transaction consistency.
Checkpoint: A checkpoint is a data structure in the control file that defines a consistent point of the database across all threads of a redo log. Checkpoints are similar to SCNs and they also describe which threads exist at that SCN. Checkpoints are used by recovery to ensure that Oracle starts reading the log threads for the redo application at the correct point. For a parallel server, each checkpoint has its own redo information.
Media Recovery Commands
To perform either a complete media recovery or incomplete media recovery, you need to be familiar with the following three media recovery commands.
RECOVER DATABASE: This command performs a media recovery on all the data files that require the application of redo.
This can be used only when the database is mounted but not open.
This command is generally used when the system data file is lost.
RECOVER TABLESPACE tablespace_name: This command performs a media recovery on all the data files in the tablespaces listed.
The database must be mounted and open.
The tablespace in question must be offline to perform the media recovery.
To recover the tablespace, you need to mount the database first, take the damaged data file offline, and then open the database and take the tablespace offline. Then issue the RECOVER TABLESPACE tablespace_name command and bring the tablespace online when the recovery is complete.
RECOVER DATAFILE 'filename': This command performs a recovery on the listed data files.
The database can be open or closed.
If the database is open, data file recovery can only recover offline files.
To recover the data file in question, mount the database, take the damaged data file offline, open the database, issue the RECOVER DATAFILE 'filename' command, and bring the data file online. This command is generally used when a non-system data file is lost.
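The preconditions for the three commands are easy to mix up, so the following Python sketch restates them as a small lookup table. It is only an illustration: the names RecoveryCommand, COMMANDS, and can_issue are hypothetical, and a "closed" database is assumed here to mean one that is mounted but not open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryCommand:
    syntax: str
    allowed_db_states: tuple          # states in which the command may be issued
    target_offline_when_open: bool    # must the target be offline if the database is open?


# Rules transcribed from the command descriptions above (hypothetical structure).
COMMANDS = {
    "database": RecoveryCommand("RECOVER DATABASE", ("mounted",), False),
    "tablespace": RecoveryCommand("RECOVER TABLESPACE tablespace_name", ("open",), True),
    "datafile": RecoveryCommand("RECOVER DATAFILE 'filename'", ("mounted", "open"), True),
}


def can_issue(kind, db_state, target_offline=False):
    """Return True if the recovery command may be issued in the given state."""
    cmd = COMMANDS[kind]
    if db_state not in cmd.allowed_db_states:
        return False
    if db_state == "open" and cmd.target_offline_when_open:
        return target_offline
    return True
```

For example, can_issue("tablespace", "open", target_offline=True) is true, whereas can_issue("database", "open") is false because RECOVER DATABASE requires a mounted, unopened database.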
Performing Recovery, Where to Start?
You are a new DBA and you get a call from the project manager saying that the users are not able to connect to the database.
As a first step, try to establish a connection for yourself as a DBA as shown. If the connection succeeds, try to connect as a regular user and see if you receive any errors during connection, because some errors that are seen by regular users do not show up when you connect as Internal or SYSDBA (such as Max sessions reached).
$ sqlplus user/pwd
Suppose you have now determined that you are not able to connect to the database.
As a second step, try to see whether the processes are running by using the following command.
$ ps -ef | grep -i ORCL
This should list the processes that are running. If it does not list any Oracle processes (other than the grep command itself), you can be sure that the database is down.
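If you want to script this check, the following Python sketch filters captured ps output for the background processes of one instance. It assumes the usual Unix naming pattern ora_<process>_<SID> (for example, ora_pmon_ORCL) and treats the presence of PMON as the sign that the instance is up; the function names are hypothetical.

```python
def oracle_processes(ps_output, sid):
    """Return the Oracle background process names for `sid` found in ps output."""
    suffix = "_" + sid.lower()
    found = []
    for line in ps_output.splitlines():
        for token in line.split():
            low = token.lower()
            # Assumed naming pattern: ora_<process>_<SID>
            if low.startswith("ora_") and low.endswith(suffix):
                found.append(token)
    return found


def instance_looks_up(ps_output, sid):
    """Treat a running PMON process as evidence that the instance is up."""
    return any(
        name.lower().startswith("ora_pmon_")
        for name in oracle_processes(ps_output, sid)
    )
```

Unlike a plain grep, this ignores the grep command's own entry in the process list, because that line contains the SID but no ora_ process name.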
As a third step, check the alert log file for any errors. The alert log file is located under the directory defined by BACKGROUND_DUMP_DEST in the init.ora file.
This file lists any errors encountered by the database. If you see any errors, note the time of the error, the error number, and the error message. If you do not see any errors, start up the database (sometimes it will report an error only when you try to start up the database). If the database starts, that is wonderful! If it doesn't start, it will generally report the error onscreen and also record it in the alert log file. Check the alert log again for more information.
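On a large alert log, it helps to pull out just the error lines. The following Python sketch does that for text you have already read from the file. It assumes only that errors appear as ORA- followed by a five-digit number; the function name find_ora_errors is hypothetical.

```python
import re

# An Oracle error code such as ORA-01157, optionally followed by ": message".
ORA_ERROR = re.compile(r"\b(ORA-\d{5})\b:?\s*(.*)")


def find_ora_errors(alert_log_text):
    """Return (line_number, error_code, message) for each line with an ORA- error."""
    hits = []
    for lineno, line in enumerate(alert_log_text.splitlines(), start=1):
        match = ORA_ERROR.search(line)
        if match:
            hits.append((lineno, match.group(1), match.group(2).strip()))
    return hits
```

The line numbers let you go back to the log and read the timestamp that precedes each error, which is the time you need to note.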
Suppose you have now determined from the error that the database cannot find one of its data files.
As a fourth step, inform the project manager that there is a problem with the database, and try to find out what happened (a hard disk problem, or perhaps somebody deleted the file). Limit this research to the time you have available.
As a fifth step, try to determine what kind of backups you have taken recently and see which one is most beneficial for recovering as much data as possible. This depends on the types of backups your site is employing to protect from database crashes.
If you have a hot backup mechanism in place, you can be sure that you can recover all or most of the data. If you have only an export or cold backup mechanism in place, the data changes made since the last backup will be lost.
As a sixth step, follow the instructions in this chapter, given your recovery scenario.
Recovery Using Cold Backup
To restore a full database, do the following:
Shut down the database.
Copy all data files, control files, and redo log files from the backup location to the original location. Verify the owner and permissions for the files (for Unix only).
Start up the database.
Recovery When a Data File Is Lost
To recover a database using a cold backup, just restore all the files from the backup location to their original locations and open the database. You can find the original physical location in the trace file you generated as part of the backup. You cannot recover the transactions that occurred between the last backup and the point of failure; that information is lost.
Recovery When a Redo Log File Is Lost
To recover the database when a redo log file is lost or corrupted, clear the affected log group with the following command:
alter database clear logfile group 1;
Here, group 1 is the corrupted log group number.
Alternatively, you can create a new control file and open the database with the resetlogs option (alter database open resetlogs). To create the control file, the database needs to be in the NOMOUNT state (startup nomount). The resetlogs option resets the redo log sequence numbering and re-creates any missing log files. To create the new control file, you need to know the full structure of the database. As part of the backup, we generated a trace of the control file by using alter database backup controlfile to trace. Follow the steps explained in Chapter 10, "Database Maintenance and Reorganization," for creating a new control file.
Recovery When a Control File Is Lost
To recover the database in case of a lost control file, you simply recreate the control file knowing the structure of the database (from the trace of control file) and open the database with reset logs. Follow the steps explained in Chapter 10 for creating a new control file.
Recovery Using Hot Backup
When the database is running in ARCHIVELOG mode and online backup is being used, there are a variety of options for recovering the database, up to the point of failure, that provide maximum protection for your data.
Recovery can be classified as follows:
Complete media recovery
  Closed database recovery
  Open database/offline tablespace recovery
  Open database/offline tablespace/individual data file recovery
Incomplete media recovery
  Cancel-based recovery
  Time-based recovery
  Change-based recovery
Complete Media Recovery
At all costs, we want to be able to fully recover the data in case of a database failure. Consequently, we always try to perform a complete recovery unless the need is to recover the database only to a specific point in time for specific reasons, such as those discussed in the next section, "Incomplete Media Recovery."
The choice of whether to use a closed or open database recovery is based on the type of failure. If you lose system data files, the only choice is a closed database recovery. If a non-system data file is lost, you can perform recovery by using either a closed or open database method. Suppose that you are running a 24/7, mission-critical database, and only part of the database (non-system) is damaged. In this situation, you can open the database for users by taking the damaged data files offline and then performing a recovery on the damaged files. This way, users can access the rest of the database while the recovery is being performed on the damaged data files.
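The decision described here is small enough to write down as a function. The following Python sketch is just a restatement of the rule in the preceding paragraph; the function name and the returned strings are hypothetical.

```python
def choose_complete_recovery(lost_system_file, must_stay_available):
    """Pick a complete media recovery method from the type of failure."""
    if lost_system_file:
        # Losing a system data file leaves only one option.
        return "closed database recovery"
    if must_stay_available:
        # Users keep working on the undamaged part of the database.
        return "open database recovery with the damaged files offline"
    return "closed or open database recovery"
```

For a 24/7 database that has lost only a non-system data file, choose_complete_recovery(False, True) selects the open database method.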
Incomplete Media Recovery
Incomplete media recovery is also very useful, for example when a user accidentally drops a table and comes to you for help. If you know the time at which the table was dropped, you can restore the database from a backup and then, using the latest control file, roll forward by applying redo log files up to the point just before the accidental drop (time-based recovery).
Point in Time Recovery
There was a database corruption at 5 p.m. and the database crashed. When I tried to bring up the database, it opened and then died as soon as I started executing any SQL statement. This crippled my ability to troubleshoot the problem. I restored the database from a backup and applied the archived redo log files up to just before the time of the crash, and the database came up fine. Remember, you have to use the latest control file to roll forward with the archived redo log files, so that Oracle knows which archived redo log files to apply.
Closed Database Recovery Steps
- Restore the damaged files from the backup.
- With the following command, mount the database but do not open it:
startup mount
- Start media recovery as follows:
recover database
At this point, you will be prompted for the location of the archived redo log files, if necessary.
- Open the database:
alter database open;
- Verify that the recovery worked.
Offline Tablespace Recovery Steps
- Restore the damaged files from the backup.
- With the following command, mount the database but do not open it:
startup mount
- Take the corrupted data file offline:
alter database datafile '/u01/oradata/users01.dbf' offline;
- Open the database as follows:
alter database open;
- After the database is open, take the tablespace offline. For example, if the corrupted data file belongs to the USERS tablespace, use the following command:
alter tablespace users offline;
Here, the tablespace can be taken offline with a normal, temporary, or immediate priority. If possible, take the damaged tablespace offline with a normal or temporary priority to minimize the amount of recovery.
- Start the recovery on the tablespace:
recover tablespace users;
At this point, you will be prompted for the location of the archived redo log files, if necessary.
- Bring the tablespace online:
alter tablespace users online;
- Verify that the recovery worked.
Offline Datafile Recovery Steps
- Restore the damaged files from the backup.
- Using the following command, mount the database but do not open it:
startup mount
- Take the corrupted data file offline:
alter database datafile '/u01/oradata/users01.dbf' offline;
- Open the database:
alter database open;
- After the database is open, take the tablespace offline. For example, if the corrupted data file belongs to the USERS tablespace, use the following command:
alter tablespace users offline;
Here, the tablespace can be taken offline with a normal, temporary, or immediate priority. If possible, take the damaged tablespace offline with a normal or temporary priority to minimize the amount of recovery.
- Start the recovery on the data file:
recover datafile '/u01/oradata/users01.dbf';
At this point, you will be prompted for the location of the archived redo log files, if necessary.
- Bring the tablespace online:
alter tablespace users online;
- Verify that the recovery worked.
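Because the same data file path and tablespace name recur in several of these steps, it can help to generate the statements from those two values. The following Python sketch does so. The function name and the priority parameter are hypothetical, and the sketch writes the offline step in the ALTER DATABASE DATAFILE ... OFFLINE form. It only builds the statement text; it does not run anything against a database.

```python
VALID_PRIORITIES = ("normal", "temporary", "immediate")


def datafile_recovery_script(datafile, tablespace, priority="normal"):
    """Return the SQL*Plus statements for an open-database data file recovery."""
    if "'" in datafile:
        raise ValueError("data file path must not contain a single quote")
    if priority not in VALID_PRIORITIES:
        raise ValueError("priority must be normal, temporary, or immediate")
    return [
        "startup mount",                                    # mount, do not open
        f"alter database datafile '{datafile}' offline;",   # isolate the damaged file
        "alter database open;",
        f"alter tablespace {tablespace} offline {priority};",
        f"recover datafile '{datafile}';",                  # prompts for archived logs
        f"alter tablespace {tablespace} online;",
    ]
```

Printing the result of datafile_recovery_script('/u01/oradata/users01.dbf', 'users') one statement per line gives a checklist you can review before typing the commands at the SQL*Plus prompt.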
Cancel-Based Recovery Steps
- Restore the damaged files from the backup.
- Using the following command, mount the database but do not open it:
startup mount
- Start the recovery:
recover database until cancel [using backup controlfile]
At this point, you will be prompted for the location of the archived redo log files, if necessary. Enter cancel to stop the recovery after Oracle has applied the last archived redo log file prior to the point of corruption. If a backup control file or re-created control file is being used with incomplete recovery, you should specify the using backup controlfile option. In cancel-based recovery, you cannot stop in the middle of applying a redo log file; you either apply a redo log file completely or not at all. In time-based recovery, by contrast, you can recover to a specific point in time, regardless of the archived redo log number.
- Open the database:
alter database open resetlogs;
Whenever an incomplete media recovery is being performed or the backup control file is used for recovery, the database should be opened with the resetlogs option. The resetlogs option will reset the redo log files.
- Perform a full backup of the database.
If you open the database with resetlogs, a full backup of the database should be performed immediately after recovery. Otherwise, you will not be able to recover changes made after you reset the logs.
- Verify that the recovery worked.
Time-Based Recovery Steps
- Restore the damaged files from the backup.
- Using the following command, mount the database but do not open it:
startup mount
- Start the recovery:
recover database until time 'YYYY-MM-DD:HH24:MI:SS' [using backup controlfile]
For example:
recover database until time '1999-01-01:12:00:00' using backup controlfile
At this point, you will be prompted for the location of the archived redo log files, if necessary. Oracle automatically terminates the recovery when it reaches the specified time. If a backup control file or re-created control file is being used with incomplete recovery, you should specify the using backup controlfile option.
- Open the database:
alter database open resetlogs;
Whenever an incomplete media recovery is being performed or the backup control file is used, the database should be opened with the resetlogs option, so that it resets the log numbering.
- Perform a full backup of the database.
If the database is opened with resetlogs, a full backup of the database should be performed immediately after recovery. Otherwise, you will not be able to recover the changes made after you reset the logs.
- Verify that the recovery worked.
Change-Based Recovery Steps
- Restore the damaged files from the backup.
- Using the following command, mount the database but do not open it:
startup mount
- Start the recovery:
recover database until change scn [using backup controlfile]
For example:
recover database until change 2315 using backup controlfile
At this point, you will be prompted for the location of the archived redo log files, if necessary. Oracle automatically terminates the recovery when it reaches the specified system change number (SCN).
If a backup control file or a re-created control file is being used with an incomplete recovery, you should specify the using backup controlfile option.
- Open the database:
alter database open resetlogs;
- Perform a full backup of the database.
If the database is opened with resetlogs, a full backup of the database should be performed immediately after recovery. Otherwise, you will not be able to recover the changes made after you reset the logs.
- Verify that the recovery worked.
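The three incomplete recovery variants differ only in the UNTIL clause, which the following Python sketch makes explicit. The function name recover_until is hypothetical, and the time is formatted as YYYY-MM-DD:HH:MM:SS to match the time-based example. The sketch only builds the command text.

```python
from datetime import datetime


def recover_until(kind, value=None, backup_controlfile=False):
    """Build a RECOVER DATABASE UNTIL ... command for incomplete recovery."""
    if kind == "cancel":
        clause = "until cancel"
    elif kind == "time":
        if not isinstance(value, datetime):
            raise TypeError("time-based recovery needs a datetime value")
        clause = "until time '" + value.strftime("%Y-%m-%d:%H:%M:%S") + "'"
    elif kind == "change":
        scn = int(value)
        if scn < 0:
            raise ValueError("the SCN must not be negative")
        clause = f"until change {scn}"
    else:
        raise ValueError("kind must be cancel, time, or change")
    statement = "recover database " + clause
    if backup_controlfile:
        # Needed when a backup or re-created control file is in use.
        statement += " using backup controlfile"
    return statement
```

Whichever variant you build, the database must afterward be opened with alter database open resetlogs and backed up in full.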
System Tablespace Versus a Non-System Tablespace Recovery
When a system data file is lost or damaged, the only way to recover the database is by doing a closed database recovery using the RECOVER DATABASE command.
Checking for Files Needing Recovery
The following command can be used to check the data file status. This command works when the database is mounted or open.
select name, status from v$datafile;
Before you actually start recovering the database, you can obtain information about the files that need recovery by executing the following command. To execute the statement, the database must be mounted. The command also gives error information.
select b.name, a.error from v$recover_file a, v$datafile b where a.file# = b.file#;
Recovery Using Import
The import utility is used to import the database from the dump file generated through the export utility. This is very useful for transferring data across platforms and importing only specific objects or users. It works whether archiving is turned on or off. Full database import performance can be improved by turning off archiving during the import.
There are three levels of Import:
Full
User-level
Table-level
Full Import
A full import can be used to restore the database in case of a database crash. For example, you have a full export of the database from yesterday and your database crashed this afternoon. You can use the import command to restore the database from the previous day's backup. The restore steps are as follows.
- Create a blank database. Refer to Chapter 10 for instructions on how to create a database.
- Import the database. The following command performs a full database import, assuming that your export dump filename is export.dmp. The IGNORE=Y option ignores any create errors, and the DESTROY=N option does not destroy the existing tablespaces.
C:\>imp system/manager file=export.dmp log=import.log full=y ignore=y destroy=n
- Check the import log for any errors. With this import, the data changes between your previous backup and the crash will be lost.
Table-Level Import
A table-level import allows you to import specific objects without importing the whole database.
Example 1:
Suppose one of the developers asks you to transfer the EMP and DEPT tables of user SCOTT from database ORCL to TEST. You can use the following steps to transfer these two tables.
- Set your ORACLE_SID to ORCL.
C:\>set ORACLE_SID=ORCL
This step selects the database to which you will connect.
- Perform an export of EMP and DEPT.
C:\>exp system/manager tables=(scott.emp,scott.dept) file=export.dmp log=export.log
This command exports the table data, constraints, and any indexes on the tables. Because the tables belong to the user SCOTT, we need to prefix them with the owner name in the export command. Check the export.log file to make sure there are no errors in the export.
- Connect to the TEST database.
SQL>connect system/manager@TEST
- Empty or drop the tables if they already exist.
If the TEST database already has EMP and DEPT tables, you can truncate the tables or drop the tables as shown. Because you are connected as SYSTEM, qualify the table names with the owner SCOTT.
SQL>truncate table scott.emp;
SQL>truncate table scott.dept;
Or
SQL>drop table scott.emp;
SQL>drop table scott.dept;
- Import the tables to TEST.
C:\>set ORACLE_SID=TEST
C:\>imp system/manager fromuser=scott touser=scott tables=(EMP,DEPT) file=export.dmp log=import.log ignore=Y
Check for any errors in the import log file.
Example 2:
Suppose you walk into the office in the morning and a developer meets you in the hallway and says that he accidentally dropped the SALES table. He wants to see whether you can do anything to restore the table.
Well, you could do something if you have an export dump file from your previous backup. The steps to restore the table are as follows (assuming that this happened in the TEST database):
- Set your ORACLE_SID to the TEST database.
C:\>set ORACLE_SID=TEST
- Import the table from the previous backup.
C:\>imp system/manager fromuser=scott touser=scott tables=(SALES) file=export.dmp log=import.log ignore=Y
This command imports the SALES table from the previous backup. After the import, check the import log file for any errors.
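If you perform table-level imports often, you may prefer to assemble the imp command line from its parts rather than retype it. The following Python sketch builds the argument list using only the parameters shown in the examples above (fromuser, touser, tables, file, log, and ignore). The function name table_import_command and its default file names are hypothetical.

```python
def table_import_command(login, owner, tables,
                         dump_file="export.dmp", log_file="import.log"):
    """Return the imp argument list for a table-level import into the same owner."""
    if not tables:
        raise ValueError("at least one table name is required")
    table_list = ",".join(name.upper() for name in tables)
    return [
        "imp", login,
        f"fromuser={owner}", f"touser={owner}",
        f"tables=({table_list})",
        f"file={dump_file}", f"log={log_file}",
        "ignore=Y",
    ]
```

Joining the list with spaces, as in " ".join(table_import_command("system/manager", "scott", ["emp", "dept"])), reproduces the import command used in Example 1.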