Data Guard Failover after primary site network failure or disconnect.

Actitud, Dec 26 2013 (edited Apr 22 2014)

Hello Experts:

I'll try to be clear and specific with my issue:

Environment:

  • Two nodes with NO shared storage (I don't have an Observer running).
  • Veritas Cluster Server (VCS) with the Data Guard agent. (I don't use the Broker; the Data Guard agent "takes care" of the switchover and failover.)
  • Two single instance databases, one per node. NO RAC.

What I'm able to perform with no issues:

  • Manual switch(over) of the primary database by running VCS command "hagrp -switch oraDG_group -to standby_node"
  • Automatic fail(over) when primary node is rebooted with "reboot" or "init"
  • Automatic fail(over) when primary node is shut down with "shutdown".

What I'm NOT able to perform:

  • Automatic failover when I manually unplug the network cables from the primary site (the whole network, not only the link between the primary and standby nodes, so it's like unplugging the server from its power source).
  • The same happens if I manually disconnect the server from the power.

These are the alert logs I have:

This is the portion of the alert log at Standby site when Real Time Replication is working fine:

Recovery of Online Redo Log: Thread 1 Group 4 Seq 7 Reading mem 0

  Mem# 0: /u02/oracle/fast_recovery_area/standby_db/onlinelog/o1_mf_4_9c3tk3dy_.log

At this moment, node1 (the primary) is completely disconnected from the network. Note, at the end, that the standby database (which should be converted to PRIMARY) does not get all the archived logs from the primary because of the abnormal network disconnect:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH

Terminal Recovery: applying standby redo logs.

Terminal Recovery: thread 1 seq# 7 redo required

Terminal Recovery:

Recovery of Online Redo Log: Thread 1 Group 4 Seq 7 Reading mem 0

  Mem# 0: /u02/oracle/fast_recovery_area/standby_db/onlinelog/o1_mf_4_9c3tk3dy_.log

Identified End-Of-Redo (failover) for thread 1 sequence 7 at SCN 0xffff.ffffffff

Incomplete Recovery applied until change 15922544 time 12/23/2013 17:12:48

Media Recovery Complete (primary_db)

Terminal Recovery: successful completion

Forcing ARSCN to IRSCN for TR 0:15922544

Mon Dec 23 17:13:22 2013

ARCH: Archival stopped, error occurred. Will continue retrying

ORACLE Instance primary_db - Archival Error

Attempt to set limbo arscn 0:15922544 irscn 0:15922544

ORA-16014: log 4 sequence# 7 not archived, no available destinations

ORA-00312: online log 4 thread 1: '/u02/oracle/fast_recovery_area/standby_db/onlinelog/o1_mf_4_9c3tk3dy_.log'

Resetting standby activation ID 2071848820 (0x7b7de774)

Completed:  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH

Mon Dec 23 17:13:33 2013

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH

Attempt to do a Terminal Recovery (primary_db)

Media Recovery Start: Managed Standby Recovery (primary_db)

started logmerger process

Mon Dec 23 17:13:33 2013

Managed Standby Recovery not using Real Time Apply

Media Recovery failed with error 16157

Recovery Slave PR00 previously exited with exception 283

ORA-283 signalled during:  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH...

Mon Dec 23 17:13:34 2013

Shutting down instance (immediate)

Shutting down instance: further logons disabled

Stopping background process MMNL

Stopping background process MMON

License high water mark = 38

All dispatchers and shared servers shutdown

ALTER DATABASE CLOSE NORMAL

ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...

ALTER DATABASE DISMOUNT

Shutting down archive processes

Archiving is disabled

Mon Dec 23 17:13:38 2013

Mon Dec 23 17:13:38 2013

Mon Dec 23 17:13:38 2013

ARCH shutting down

ARCH shutting down

ARCH shutting down

ARC0: Relinquishing active heartbeat ARCH role

ARC2: Archival stopped

ARC0: Archival stopped

ARC1: Archival stopped

Completed: ALTER DATABASE DISMOUNT

ARCH: Archival disabled due to shutdown: 1089

Shutting down archive processes

Archiving is disabled

Mon Dec 23 17:13:40 2013

Stopping background process VKTM

ARCH: Archival disabled due to shutdown: 1089

Shutting down archive processes

Archiving is disabled

Mon Dec 23 17:13:43 2013

Instance shutdown complete

Mon Dec 23 17:13:44 2013

Adjusting the default value of parameter parallel_max_servers

from 1280 to 470 due to the value of parameter processes (500)

Starting ORACLE instance (normal)

************************ Large Pages Information *******************

Per process system memlock (soft) limit = 64 KB

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)

Large Pages unused system wide = 0 (0 KB)

Large Pages configured system wide = 0 (0 KB)

Large Page size = 2048 KB

RECOMMENDATION:

  Total System Global Area size is 3762 MB. For optimal performance,

  prior to the next instance restart:

  1. Increase the number of unused large pages by

at least 1881 (page size 2048 KB, total size 3762 MB) system wide to

  get 100% of the System Global Area allocated with large pages

  2. Large pages are automatically locked into physical memory.

Increase the per process memlock (soft) limit to at least 3770 MB to lock

100% System Global Area's large pages into physical memory

********************************************************************

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

Initial number of CPU is 32

Number of processor cores in the system is 16

Number of processor sockets in the system is 2

CELL communication is configured to use 0 interface(s):

CELL IP affinity details:

    NUMA status: NUMA system w/ 2 process groups

    cellaffinity.ora status: cannot find affinity map at '/etc/oracle/cell/network-config/cellaffinity.ora' (see trace file for details)

CELL communication will use 1 IP group(s):

    Grp 0:

Picked latch-free SCN scheme 3

Autotune of undo retention is turned on.

IMODE=BR

ILAT =88

LICENSE_MAX_USERS = 0

SYS auditing is disabled

NUMA system with 2 nodes detected

Starting up:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options.

ORACLE_HOME = /u01/oracle/product/11.2.0.4

System name:    Linux

Node name:      node2.localdomain

Release:        2.6.32-131.0.15.el6.x86_64

Version:        #1 SMP Tue May 10 15:42:40 EDT 2011

Machine:        x86_64

Using parameter settings in server-side spfile /u01/oracle/product/11.2.0.4/dbs/spfileprimary_db.ora

System parameters with non-default values:

  processes                = 500

  sga_target               = 3760M

  control_files            = "/u02/oracle/orafiles/primary_db/control01.ctl"

  control_files            = "/u01/oracle/fast_recovery_area/primary_db/control02.ctl"

  db_file_name_convert     = "standby_db"

  db_file_name_convert     = "primary_db"

  log_file_name_convert    = "standby_db"

  log_file_name_convert    = "primary_db"

  control_file_record_keep_time= 40

  db_block_size            = 8192

  compatible               = "11.2.0.4.0"

  log_archive_dest_1       = "location=/u02/oracle/archivelogs/primary_db"

  log_archive_dest_2       = "SERVICE=primary_db ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=primary_db"

  log_archive_dest_state_2 = "ENABLE"

  log_archive_min_succeed_dest= 1

  fal_server               = "primary_db"

  log_archive_trace        = 0

  log_archive_config       = "DG_CONFIG=(primary_db,standby_db)"

  log_archive_format       = "%t_%s_%r.dbf"

  log_archive_max_processes= 3

  db_recovery_file_dest    = "/u02/oracle/fast_recovery_area"

  db_recovery_file_dest_size= 30G

  standby_file_management  = "AUTO"

  db_flashback_retention_target= 1440

  undo_tablespace          = "UNDOTBS1"

  remote_login_passwordfile= "EXCLUSIVE"

  db_domain                = ""

  dispatchers              = "(PROTOCOL=TCP) (SERVICE=primary_dbXDB)"

  job_queue_processes      = 0

  audit_file_dest          = "/u01/oracle/admin/primary_db/adump"

  audit_trail              = "DB"

  db_name                  = "primary_db"

  db_unique_name           = "standby_db"

  open_cursors             = 300

  pga_aggregate_target     = 1250M

  dg_broker_start          = FALSE

  diagnostic_dest          = "/u01/oracle"

Mon Dec 23 17:13:45 2013

PMON started with pid=2, OS id=29108

Mon Dec 23 17:13:45 2013

PSP0 started with pid=3, OS id=29110

Mon Dec 23 17:13:46 2013

VKTM started with pid=4, OS id=29125 at elevated priority

VKTM running at (1)millisec precision with DBRM quantum (100)ms

Mon Dec 23 17:13:46 2013

GEN0 started with pid=5, OS id=29129

Mon Dec 23 17:13:46 2013

DIAG started with pid=6, OS id=29131

Mon Dec 23 17:13:46 2013

DBRM started with pid=7, OS id=29133

Mon Dec 23 17:13:46 2013

DIA0 started with pid=8, OS id=29135

Mon Dec 23 17:13:46 2013

MMAN started with pid=9, OS id=29137

Mon Dec 23 17:13:46 2013

DBW0 started with pid=10, OS id=29139

Mon Dec 23 17:13:46 2013

DBW1 started with pid=11, OS id=29141

Mon Dec 23 17:13:46 2013

DBW2 started with pid=12, OS id=29143

Mon Dec 23 17:13:46 2013

DBW3 started with pid=13, OS id=29145

Mon Dec 23 17:13:46 2013

LGWR started with pid=14, OS id=29147

Mon Dec 23 17:13:46 2013

CKPT started with pid=15, OS id=29149

Mon Dec 23 17:13:46 2013

SMON started with pid=16, OS id=29151

Mon Dec 23 17:13:46 2013

RECO started with pid=17, OS id=29153

Mon Dec 23 17:13:46 2013

MMON started with pid=18, OS id=29155

Mon Dec 23 17:13:46 2013

MMNL started with pid=19, OS id=29157

starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...

starting up 1 shared server(s) ...

ORACLE_BASE from environment = /u01/oracle

Mon Dec 23 17:13:46 2013

ALTER DATABASE   MOUNT

ARCH: STARTING ARCH PROCESSES

Mon Dec 23 17:13:50 2013

ARC0 started with pid=23, OS id=29210

ARC0: Archival started

ARCH: STARTING ARCH PROCESSES COMPLETE

ARC0: STARTING ARCH PROCESSES

Successful mount of redo thread 1, with mount id 2071851082

Mon Dec 23 17:13:51 2013

ARC1 started with pid=24, OS id=29212

Allocated 15937344 bytes in shared pool for flashback generation buffer

Mon Dec 23 17:13:51 2013

ARC2 started with pid=25, OS id=29214

Starting background process RVWR

ARC1: Archival started

ARC1: Becoming the 'no FAL' ARCH

ARC1: Becoming the 'no SRL' ARCH

Mon Dec 23 17:13:51 2013

RVWR started with pid=26, OS id=29216

Physical Standby Database mounted.

Lost write protection disabled

Completed: ALTER DATABASE   MOUNT

Mon Dec 23 17:13:51 2013

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE

         USING CURRENT LOGFILE DISCONNECT FROM SESSION

Attempt to start background Managed Standby Recovery process (primary_db)

Mon Dec 23 17:13:51 2013

MRP0 started with pid=27, OS id=29219

MRP0: Background Managed Standby Recovery process started (primary_db)

ARC2: Archival started

ARC0: STARTING ARCH PROCESSES COMPLETE

ARC2: Becoming the heartbeat ARCH

ARC2: Becoming the active heartbeat ARCH

ARCH: Archival stopped, error occurred. Will continue retrying

ORACLE Instance primary_db - Archival Error

ORA-16014: log 4 sequence# 7 not archived, no available destinations

ORA-00312: online log 4 thread 1: '/u02/oracle/fast_recovery_area/standby_db/onlinelog/o1_mf_4_9c3tk3dy_.log'

At this moment, I've lost service and I have to wait until the primary server comes back up to receive the missing log.

This is the rest of the log:

***********************************************************************

Fatal NI connect error 12543, connecting to:

(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=primary_db)(CID=(PROGRAM=oracle)(HOST=node2.localdomain)(USER=oracle))))

  VERSION INFORMATION:

        TNS for Linux: Version 11.2.0.4.0 - Production

        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production

  Time: 23-DEC-2013 17:13:52

  Tracing not turned on.

  Tns error struct:

    ns main err code: 12543

TNS-12543: TNS:destination host unreachable

    ns secondary err code: 12560

    nt main err code: 513

TNS-00513: Destination host unreachable

    nt secondary err code: 113

    nt OS err code: 0

***********************************************************************

Fatal NI connect error 12543, connecting to:

(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=primary_db)(CID=(PROGRAM=oracle)(HOST=node2.localdomain)(USER=oracle))))

  VERSION INFORMATION:

        TNS for Linux: Version 11.2.0.4.0 - Production

        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production

  Time: 23-DEC-2013 17:13:55

  Tracing not turned on.

  Tns error struct:

    ns main err code: 12543

TNS-12543: TNS:destination host unreachable

    ns secondary err code: 12560

    nt main err code: 513

TNS-00513: Destination host unreachable

    nt secondary err code: 113

    nt OS err code: 0

started logmerger process

Mon Dec 23 17:13:56 2013

Managed Standby Recovery starting Real Time Apply

MRP0: Background Media Recovery terminated with error 16157

Errors in file /u01/oracle/diag/rdbms/standby_db/primary_db/trace/primary_db_pr00_29230.trc:

ORA-16157: media recovery not allowed following successful FINISH recovery

Managed Standby Recovery not using Real Time Apply

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE

         USING CURRENT LOGFILE DISCONNECT FROM SESSION

Recovery Slave PR00 previously exited with exception 16157

MRP0: Background Media Recovery process shutdown (primary_db)

***********************************************************************

Fatal NI connect error 12543, connecting to:

(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=primary_db)(CID=(PROGRAM=oracle)(HOST=node2.localdomain)(USER=oracle))))

  VERSION INFORMATION:

        TNS for Linux: Version 11.2.0.4.0 - Production

        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production

  Time: 23-DEC-2013 17:13:58

  Tracing not turned on.

  Tns error struct:

    ns main err code: 12543

TNS-12543: TNS:destination host unreachable

    ns secondary err code: 12560

    nt main err code: 513

TNS-00513: Destination host unreachable

    nt secondary err code: 113

    nt OS err code: 0

Mon Dec 23 17:14:01 2013

***********************************************************************

Fatal NI connect error 12543, connecting to:

(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=primary_db)(CID=(PROGRAM=oracle)(HOST=node2.localdomain)(USER=oracle))))

  VERSION INFORMATION:

        TNS for Linux: Version 11.2.0.4.0 - Production

        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production

  Time: 23-DEC-2013 17:14:01

  Tracing not turned on.

  Tns error struct:

    ns main err code: 12543

TNS-12543: TNS:destination host unreachable

    ns secondary err code: 12560

    nt main err code: 513

TNS-00513: Destination host unreachable

    nt secondary err code: 113

    nt OS err code: 0

Error 12543 received logging on to the standby

FAL[client, ARC0]: Error 12543 connecting to primary_db for fetching gap sequence

Archiver process freed from errors. No longer stopped

Mon Dec 23 17:15:07 2013

Using STANDBY_ARCHIVE_DEST parameter default value as /u02/oracle/archivelogs/primary_db

Mon Dec 23 17:19:51 2013

ARCH: Archival stopped, error occurred. Will continue retrying

ORACLE Instance primary_db - Archival Error

ORA-16014: log 4 sequence# 7 not archived, no available destinations

ORA-00312: online log 4 thread 1: '/u02/oracle/fast_recovery_area/standby_db/onlinelog/o1_mf_4_9c3tk3dy_.log'

Mon Dec 23 17:26:18 2013

RFS[1]: Assigned to RFS process 31456

RFS[1]: No connections allowed during/after terminal recovery.

Mon Dec 23 17:26:47 2013

flashback database to scn 15921680

ORA-16157 signalled during: flashback database to scn 15921680...

Mon Dec 23 17:27:05 2013

alter database recover managed standby database using current logfile disconnect

Attempt to start background Managed Standby Recovery process (primary_db)

Mon Dec 23 17:27:05 2013

MRP0 started with pid=28, OS id=31481

MRP0: Background Managed Standby Recovery process started (primary_db)

started logmerger process

Mon Dec 23 17:27:10 2013

Managed Standby Recovery starting Real Time Apply

MRP0: Background Media Recovery terminated with error 16157

Errors in file /u01/oracle/diag/rdbms/standby_db/primary_db/trace/primary_db_pr00_31486.trc:

ORA-16157: media recovery not allowed following successful FINISH recovery

Managed Standby Recovery not using Real Time Apply

Completed: alter database recover managed standby database using current logfile disconnect

Recovery Slave PR00 previously exited with exception 16157

MRP0: Background Media Recovery process shutdown (primary_db)

Mon Dec 23 17:27:18 2013

RFS[2]: Assigned to RFS process 31492

RFS[2]: No connections allowed during/after terminal recovery.

Mon Dec 23 17:28:18 2013

RFS[3]: Assigned to RFS process 31614

RFS[3]: No connections allowed during/after terminal recovery.
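
To make the error sequence in the log above easier to scan, here is a minimal Python sketch that lists each ORA-/TNS- code together with the most recent timestamp line before it. It is not part of Oracle or VCS tooling; the function name `summarize_errors` is a hypothetical helper, and it assumes timestamp lines in the form shown above (for example "Mon Dec 23 17:13:22 2013") and error codes at the start of a line.

```python
import re
from datetime import datetime

# Alert-log timestamp lines look like "Mon Dec 23 17:13:22 2013".
# Parsing %a/%b assumes the default (English/C) locale.
TS_FORMAT = "%a %b %d %H:%M:%S %Y"

# Matches codes such as ORA-16157 or TNS-12543 at the start of a line.
CODE_RE = re.compile(r"^(ORA|TNS)-(\d+)")


def summarize_errors(log_text):
    """Return (timestamp, code) pairs in log order, skipping consecutive repeats.

    The timestamp is the last timestamp line seen before the error,
    or None if no timestamp line has appeared yet.
    """
    current_ts = None
    events = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        try:
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line
        match = CODE_RE.match(line)
        if match:
            event = (current_ts, match.group(0))
            if not events or events[-1] != event:
                events.append(event)
    return events


if __name__ == "__main__":
    sample = (
        "Mon Dec 23 17:13:22 2013\n"
        "ORA-16014: log 4 sequence# 7 not archived, no available destinations\n"
        "Mon Dec 23 17:13:33 2013\n"
        "ORA-283 signalled during:  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH...\n"
    )
    for ts, code in summarize_errors(sample):
        print(ts, code)
```

Lines that mention an error number without the ORA-/TNS- prefix (such as "Media Recovery failed with error 16157") are not picked up by this sketch.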

Do you have any advice?

Thanks!

Alex.
