Oracle Instance Recovery

Oracle instance recovery is an advanced concept, but its interesting to know and understand it.

Once you understand it, you will appreciate how Oracle handles the situation when suddenly an Oracle database shuts down due to a system crash.

There might be some transactions in progress, how does Oracle handles that all. Read on...

As I mentioned in my "About Oracle" article, log buffer cache holds change vectors corresponding to changed data blocks in DB buffer cache.

When some data is updated, you know NEW copy of data is stored in DB buffer cache (it is not immediately written to disk by DBWR) and a change vector goes to log buffer cache, to log the transaction, thereby ensuring transaction recoverability.

Along-with that, the OLD copy of that data (i.e. the value before change) is stored in undo segments, which are blocks that are updated just like data blocks in buffer cache. Holding this OLD copy is necessary for purpose of doing rollback and read-consistency.

Know the term SCN (system change number). SCN is just an unique number generated by the database with each transaction. When a checkpoint happens (note the SCN), Oracle ensures that all data changes till the checkpoint SCN are written to the disc.

When a database shuts down abruptly, may be due to a sudden system crash or shutdown abort command, DB is left in an inconsistent state, means all ongoing transactions (committed or non-committed) were not completed. Its a MUST for a DB to be in consistent state before it can be opened. To bring DB in consistent, Oracle performs instance recovery.

Instance recovery has 2 steps : Roll forward (or cache recovery) AND rolling back (or transaction recovery). It basically re-applies committed, and then rolls back un-committed transactions.

Roll forward or cache recovery involves reapplying relevant changes recorded in the online redo log to the data files. Because rollback data (OLD copy) is also recorded in the online redo log, rolling forward also regenerates the corresponding undo segments. So it brings DB in time.

The data files could also contain uncommitted changes that were either saved to the data files before the failure, or were recorded in the online redo log and introduced during roll forward step above.

After the roll forward, any changes that were not committed, must be undone. Oracle uses the checkpoint position, which guarantees that every committed change with an SCN lower than the checkpoint SCN is saved on disk. Oracle Database applies undo blocks to roll back uncommitted changes in data blocks that were written before the failure or introduced during cache recovery. This phase is called rolling back or transaction recovery.

Oracle instance recovery