Wednesday 8 July 2015

Diagnosing and Repairing Failures with 11g Data Recovery Advisor

The 11g Data Recovery Advisor is part of the 11g database health checking framework and diagnoses persistent data failures and not only presents options to repair and fix the problem but also can execute the repair and recovery process at our request.

The Repair Advisor can take away lot of the stress associated with peforming backup and recovery by diagnosing what is wrong as well as presenting us with the syntax as well to execute the commands to restore and recover as the case may be. Under pressure, everyone can make mistakes and it is comforting to know that there is a tool which can really he;p the DBA.

The Data Recovery Advisor can be used via OEM Database or Grid Control or via the RMAN command line interface.

Let us look at an example of using the RMAN Data Recovery Advisor to recover from a loss of control files situation with and without the CONTROL AUTOBACKUP option being enabled.

Note, that when there is no control file autobackup, the RMAN Repair Advisor is not able to do the full automated recovery for us and we use a combination of automatic and manual repair to fix the problem.


Scenario is loss of control files – AUTOBACKUP is enabled


RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
———- ——– ——— ————- ——-
5652 CRITICAL OPEN 18-JUN-12 Control file /u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl is missing
5649 CRITICAL OPEN 18-JUN-12 Control file /u01/app/oracle/oradata/sqlfun/control01.ctl is missing

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
———- ——– ——— ————- ——-
5304 CRITICAL OPEN 18-JUN-12 Control file /u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl is missing

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=133 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
no manual actions available

Automated Repair Options
========================
Option Repair Description
—— ——————
1 Use a multiplexed copy to restore control file /u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/sqlfun/sqlfun/hm/reco_632546057.hm

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/sqlfun/sqlfun/hm/reco_632546057.hm

contents of repair script:
# restore control file using multiplexed copy
restore controlfile from ‘/u01/app/oracle/oradata/sqlfun/control01.ctl';
sql ‘alter database mount';

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/sqlfun/sqlfun/hm/reco_632546057.hm

contents of repair script:
# restore control file using multiplexed copy
restore controlfile from ‘/u01/app/oracle/oradata/sqlfun/control01.ctl';
sql ‘alter database mount';

Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script

Starting restore at 18-JUN-12
using channel ORA_DISK_1

channel ORA_DISK_1: copied control file copy
output file name=/u01/app/oracle/oradata/sqlfun/control01.ctl
output file name=/u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl
Finished restore at 18-JUN-12

sql statement: alter database mount
released channel: ORA_DISK_1
repair failure complete

Do you want to open the database (enter YES or NO)? YES
database opened

RMAN>

RMAN> list failure;

no failures found that match specification


Scenario is loss of control files – No AUTOBACKUP


RMAN> list failure;

using target database control file instead of recovery catalog
List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
———- ——– ——— ————- ——-
5652 CRITICAL OPEN 18-JUN-12 Control file /u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl is missing
5649 CRITICAL OPEN 18-JUN-12 Control file /u01/app/oracle/oradata/sqlfun/control01.ctl is missing

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
———- ——– ——— ————- ——-
5652 CRITICAL OPEN 18-JUN-12 Control file /u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl is missing
5649 CRITICAL OPEN 18-JUN-12 Control file /u01/app/oracle/oradata/sqlfun/control01.ctl is missing

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=135 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
1. If file /u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl was unintentionally renamed or moved, restore it
2. If file /u01/app/oracle/oradata/sqlfun/control01.ctl was unintentionally renamed or moved, restore it
3. If you have a CREATE CONTROLFILE script, use it to create a new control file
4. Contact Oracle Support Services if the preceding recommendations cannot be used, or if they do not fix the failures selected for repair

Optional Manual Actions
=======================
1. If a standby database is available, then perform a Data Guard failover initiated from the standby

Automated Repair Options
========================
no automatic repair options available

RMAN> repair failure preview;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of repair command at 06/18/2012 11:00:06
RMAN-06953: no automatic repairs were listed by ADVISE FAILURE

Find the last database backup of control file in FRA

RMAN> restore controlfile from ‘/u01/app/oracle/flash_recovery_area/SQLFUN/autobackup/2012_06_18/o1_mf_s_786251074_7xwbl3l4_.bkp';

Starting restore at 18-JUN-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=125 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
output file name=/u01/app/oracle/oradata/sqlfun/control01.ctl
output file name=/u01/app/oracle/flash_recovery_area/sqlfun/control02.ctl
Finished restore at 18-JUN-12

RMAN> list failure;

no failures found that match specification

RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

RMAN> alter database open resetlogs;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 06/18/2012 11:36:01
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: ‘/u01/app/oracle/oradata/sqlfun/system01.dbf’

RMAN> list failure;

List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
———- ——– ——— ————- ——-
5898 CRITICAL OPEN 18-JUN-12 System datafile 1: ‘/u01/app/oracle/oradata/sqlfun/system01.dbf’ needs media recovery
5895 CRITICAL OPEN 18-JUN-12 Control file needs media recovery
8 HIGH OPEN 18-JUN-12 One or more non-system datafiles need media recovery

RMAN> advise failure;

Starting implicit crosscheck backup at 18-JUN-12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=125 device type=DISK
Crosschecked 3 objects
Finished implicit crosscheck backup at 18-JUN-12

Starting implicit crosscheck copy at 18-JUN-12
using channel ORA_DISK_1
Crosschecked 5 objects
Finished implicit crosscheck copy at 18-JUN-12

searching for all files in the recovery area
cataloging files…
cataloging done

List of Cataloged Files
=======================
File Name: /u01/app/oracle/flash_recovery_area/SQLFUN/backupset/2012_06_18/o1_mf_ncsnf_TAG20120618T112825_7xx83b9m_.bkp
File Name: /u01/app/oracle/flash_recovery_area/SQLFUN/autobackup.old/2012_06_18/o1_mf_s_786277031_7xx3x7pw_.bkp.old
File Name: /u01/app/oracle/flash_recovery_area/SQLFUN/autobackup.old/2012_06_18/o1_mf_s_786251074_7xwbl3l4_.bkp
File Name: /u01/app/oracle/flash_recovery_area/SQLFUN/autobackup.old/2012_06_18/o1_mf_s_786281256_7xx8184v_.bkp
File Name: /u01/app/oracle/flash_recovery_area/SQLFUN/autobackup.old/2012_06_18/o1_mf_s_786279412_7xx67nlv_.bkp

List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
———- ——– ——— ————- ——-
5898 CRITICAL OPEN 18-JUN-12 System datafile 1: ‘/u01/app/oracle/oradata/sqlfun/system01.dbf’ needs media recovery
5895 CRITICAL OPEN 18-JUN-12 Control file needs media recovery
8 HIGH OPEN 18-JUN-12 One or more non-system datafiles need media recovery

analyzing automatic repair options; this may take some time
using channel ORA_DISK_1
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
1. If you have the correct version of the control file, then shutdown the database and replace the old control file
2. If you restored the wrong version of data file /u01/app/oracle/oradata/sqlfun/system01.dbf, then replace it with the correct one
3. If you restored the wrong version of data file /u01/app/oracle/oradata/sqlfun/sysaux01.dbf, then replace it with the correct one
4. If you restored the wrong version of data file /u01/app/oracle/oradata/sqlfun/undotbs01.dbf, then replace it with the correct one
5. If you restored the wrong version of data file /u01/app/oracle/oradata/sqlfun/users01.dbf, then replace it with the correct one
6. If you restored the wrong version of data file /u01/app/oracle/oradata/sqlfun/threatened_fauna_data.dbf, then replace it with the correct one

Automated Repair Options
========================
Option Repair Description
—— ——————
1 Recover database
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/sqlfun/sqlfun/hm/reco_2266295139.hm

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/sqlfun/sqlfun/hm/reco_2266295139.hm

contents of repair script:
# recover database
recover database;
alter database open resetlogs;

RMAN> repair failure noprompt;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/sqlfun/sqlfun/hm/reco_2266295139.hm

contents of repair script:
# recover database
recover database;
alter database open resetlogs;
executing repair script

Starting recover at 18-JUN-12
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 1 is already on disk as file /u01/app/oracle/oradata/sqlfun/redo01.log
archived log file name=/u01/app/oracle/oradata/sqlfun/redo01.log thread=1 sequence=1
media recovery complete, elapsed time: 00:00:01
Finished recover at 18-JUN-12

database opened
repair failure complete

RMAN>

No comments:

Post a Comment