Friday, 26 June 2015

Switchover and Failover

This section is what Data Guard is really all about: a standby database taking over from the production database, and how to revert once the problems have been fixed.

Role transitions are divided into two types, switchover and failover. You may also hear them called switchback and failback, but these all mean the same thing: moving Data Guard from one state to another.

Switchover

Switchover is the act of changing the standby database into the primary in a controlled manner. Because it is a planned event it is safe from data loss: the primary database must complete all redo generation on the production data before the switchover is allowed to commence. A separate switchback operation does not exist; it is simply a switchover in the reverse order, which moves the primary role back to its original server. A switchover normally happens during a quiet period, and the reason for it might be DR testing, patching, hardware changes, implementing RAC, etc.

Once the switchover is complete the new primary sends its redo to the remaining standby servers, including the old primary. If you are using either Grid Control or the Broker this should all be done automatically for you, but if you are using SQLPlus you have to perform some manual work.

You always start the switchover on the primary database. The actual switchover command, which is what gets run whether you are using Grid Control, the Broker or SQLPlus, is below.
start the switchover (primary)    

alter database commit to switchover to standby;
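
Before committing, it can be worth confirming that the primary is actually ready to switch. The following is a minimal sketch, assuming a physical standby configuration and a SQLPlus session connected as SYSDBA on the primary; it is a pre-check, not a required part of the procedure.

```sql
-- Run on the primary.
-- TO STANDBY means the primary is ready to switch over;
-- SESSIONS ACTIVE means user sessions are still connected.
select switchover_status from v$database;

-- If sessions are still active, this variant of the switchover
-- command disconnects them as part of the switchover.
alter database commit to switchover to physical standby with session shutdown;
```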

When the switchover command is executed redo generation is stopped, all DML-related cursors are invalidated, users are either prevented from executing transactions or terminated, and the current redo log is archived for each thread. A special switchover marker called the EOR (end of redo) is then placed in the header of the next sequence for each thread, and the online redo files are archived a second time, sending the final sequences to the standby databases. At this point the physical standby database is closed and the final log switch is done without allowing the primary database to advance the sequence numbers for each thread.

After the EOR redo is sent to the standby databases, the original primary database is finalized as a standby and its control file backed up to the trace file and converted to the correct type of standby control file. In the case of a physical standby switchover the managed recovery process (MRP) is automatically started on the original primary to apply the final archive logs that contain the EOR so that all the redo ever generated is processed. The primary is then dismounted and must then be restarted as a standby database in at least the mount state.
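
If you are doing this by hand in SQLPlus, restarting the old primary as a standby might look like the sketch below. It assumes a physical standby and a SYSDBA session on the old primary; depending on the Oracle version the instance may already be down after the switchover command, in which case only the startup is needed.

```sql
-- Run on the old primary once the switchover command has returned.
shutdown immediate
startup mount

-- Start managed recovery (MRP) in the background so the database
-- begins applying redo from the new primary.
alter database recover managed standby database disconnect from session;

-- Confirm the role change; this should now report PHYSICAL STANDBY.
select database_role, open_mode from v$database;
```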

The standby database must receive this EOR redo, otherwise the switchover cannot occur. Once this redo has been received and applied, you complete the switchover by running the following command; this will be automatic if you are using Grid Control or the Broker.
complete the switchover (new primary)   


alter database commit to switchover to primary;
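
One way to check that the standby has received and applied the EOR before running the command above is to look at its switchover status. This is a sketch assuming a physical standby and a SYSDBA session on that standby.

```sql
-- Run on the standby that is about to become the primary.
-- TO PRIMARY (or SESSIONS ACTIVE) indicates it is ready to take over;
-- NOT ALLOWED usually means the end-of-redo has not arrived yet.
select switchover_status from v$database;

-- Optional: see what the managed recovery process is currently doing.
select process, status, thread#, sequence#
from   v$managed_standby
where  process like 'MRP%';
```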

The physical standby switchover will wait for the MRP process to exit after processing the EOR redo and then convert the standby control file into a normal production control file. The final thing to do is to open the database for general production use.
open the database (new primary)


alter database open;
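
After the open, a quick sanity check on the new primary can confirm that the role transition took effect. This is a sketch assuming a single-instance physical standby configuration; the log switch is just a simple way to push some redo towards the standbys.

```sql
-- Run on the new primary; database_role should be PRIMARY
-- and open_mode should be READ WRITE.
select name, database_role, open_mode, switchover_status
from   v$database;

-- Force a log switch so redo starts flowing to the standby databases.
alter system switch logfile;
```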

A logical standby also has to wait for the EOR redo from the primary to be applied and for SQL Apply to shut down before the switchover command can complete. Once the EOR has been processed, the GUARD can be turned off and production processing can begin.
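
For a logical standby the manual sequence has an extra prepare phase. The outline below is a sketch of the usual order, assuming SYSDBA sessions on both databases; in practice you would check switchover_status in v$database between the steps rather than run them back to back.

```sql
-- 1. On the primary: prepare to become a logical standby.
alter database prepare to switchover to logical standby;

-- 2. On the logical standby: prepare to become the primary.
alter database prepare to switchover to primary;

-- 3. On the primary: commit the switchover.
alter database commit to switchover to logical standby;

-- 4. On the logical standby: commit once the EOR has been applied.
alter database commit to switchover to primary;

-- 5. On the new logical standby (the old primary): start SQL Apply.
alter database start logical standby apply immediate;

-- On either database: ALL or STANDBY means the guard is on,
-- NONE means normal production processing is allowed.
select guard_status from v$database;
```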

Failover



A failover is an unplanned event, when something has happened to hardware, networking, etc. This is when you invoke your DR procedures (hopefully documented), and you want to have full confidence in getting the new primary up and running as quickly as possible. Unlike a switchover, which begins on the primary, no primary is involved, which means you will not be able to get any more redo from the primary. Depending on which protection mode you have chosen there may be data loss (unless you have Maximum Protection mode enabled). You start by telling Data Guard to apply the remaining redo that it can. Once the redo has been applied you run the same command as in a physical standby switchover to convert the standby into a primary.


complete the failover (new primary)

alter database commit to switchover to primary; 
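
Put together, a manual failover to a physical standby might look like the sketch below. It assumes 11g-style syntax, a SYSDBA session on the standby, and that the standby is mounted rather than open read-only; treat it as an outline of the steps described above, not a complete DR runbook.

```sql
-- Run on the physical standby that is to become the new primary.

-- Stop redo apply (this raises an error if MRP is not running,
-- which can be ignored).
alter database recover managed standby database cancel;

-- Apply all the remaining redo that the standby has received.
alter database recover managed standby database finish;

-- Convert the standby into a primary and open it.
alter database commit to switchover to primary;
alter database open;
```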


One difference is that when the failover has completed the protection mode will be maximum performance regardless of what it was before. To get back to your original protection mode you must get a standby database back up and running, then manually execute the steps to get it into the protection mode you want.
change the protection mode    

-- Choose the level of protection you require (run only one of these)

alter database set standby database to maximize performance;
alter database set standby database to maximize availability;
alter database set standby database to maximize protection;
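
To see which mode is configured and which is actually in effect, you can query v$database on the primary. This is a small sketch; the two columns can differ, for example while a standby is still catching up after being brought back.

```sql
-- protection_mode  = the mode that has been configured
-- protection_level = the level currently being enforced
select protection_mode, protection_level from v$database;
```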


If you are using a protection mode that may result in data loss, the received archived redo logs are merged into a single thread and the sequence is sorted by dependent transaction; this merged thread is then applied to the standby database up until the last redo. This may take some time in a RAC environment, as the redo data has to be transferred from each instance.

Since the redo heartbeat is sent every 6 seconds or so, the general rule is that you may lose about 6 seconds of redo during a failover, but this is a best guess. At failover the merging thread looks at the last log of the disconnected thread and uses the last heartbeat in it to define the consistent point, throwing away any redo beyond that point that the surviving nodes had been sending all along.
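
Rather than rely on the rule of thumb alone, you can get a rough idea of your exposure from the lag statistics on the standby. This is a sketch assuming a SYSDBA session on the standby; the values are only as fresh as the last time they were computed, so treat them as an estimate.

```sql
-- Run on the standby.
-- transport lag = redo generated on the primary but not yet received
-- apply lag     = redo received but not yet applied
select name, value, time_computed
from   v$dataguard_stats
where  name in ('transport lag', 'apply lag');
```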

