ORACLE 11.2.0.4 RAC: VIP Does Not Fail Back After the Network Is Restored
Published: 2021-09-30
Intermittently VIP Failback does not work after the Network Connection is Restored (Doc ID 1992370.1)
In this Document
Symptoms
Cause
Solution
References
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.4 and later
Oracle Net Services - Version 12.1.0.2 to 12.1.0.2 [Release 12.1]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Information in this document applies to any platform.
SYMPTOMS
In a 2-node RAC cluster, a VIP failover test is performed. After the public network cable on node 2 is pulled, the VIP of node 2 fails over to node 1 as expected. After the cable is reconnected, the VIP of node 2 intermittently does not fail back to node 2 and instead goes OFFLINE. Starting the VIP of node 2 manually with srvctl works. At other times the VIP fails back to node 2 without intervention.
1. Pull the cable from the public NIC of node 2:
Jun 9 21:41:29 EVENT START: network_down <hostname> net_ether_01
Jun 9 21:41:29 EVENT COMPLETED: network_down <hostname> net_ether_01 0
Jun 9 21:41:37 EVENT START: network_down_complete <hostname> net_ether_01
Jun 9 21:41:37 EVENT COMPLETED: network_down_complete <hostname> net_ether_01
2. VIP of node 2 fails over to node 1 as expected:
Mon Jun 9 21:41:03 KST 2014
------------------------------------------------------------------------------
NAME           TARGET  STATE         SERVER    STATE_DETAILS
------------------------------------------------------------------------------
ora.<host>.vip
      1        ONLINE  INTERMEDIATE  <host>    FAILED OVER
3. Reconnect the cable to the public NIC of node 2:
Jun 9 21:44:15 EVENT START: network_up <hostname> net_ether_01
Jun 9 21:44:15 EVENT COMPLETED: network_up <hostname> net_ether_01 0
Jun 9 21:44:15 EVENT START: network_up_complete <hostname> net_ether_01
Jun 9 21:44:15 EVENT COMPLETED: network_up_complete <hostname> net_ether_01 0
4. The VIP of node 2 does not fail back; instead it becomes OFFLINE:
Mon Jun 9 21:56:23 KST 2014
------------------------------------------------------------------------------
NAME           TARGET  STATE         SERVER    STATE_DETAILS
------------------------------------------------------------------------------
ora.<host>.vip
      1        OFFLINE OFFLINE
5. After running srvctl start nodeapps, the VIP starts normally:
Mon Jun 9 21:56:38 KST 2014
------------------------------------------------------------------------------
NAME           TARGET  STATE         SERVER    STATE_DETAILS
------------------------------------------------------------------------------
ora.<host>.vip
      1        ONLINE  ONLINE        racnode2
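The status listings in steps 2, 4 and 5 can be checked mechanically. The following is a minimal sketch, assuming the status text has the layout shown above (a resource name line followed by numbered instance rows). The function names `vip_states` and `vips_needing_manual_start` are hypothetical helpers, not part of any Oracle tool.

```python
import re

# Matches an instance row such as "1 ONLINE INTERMEDIATE <host> FAILED OVER".
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s*(.*)$")


def vip_states(status_text):
    """Return {resource_name: (target, state, remainder)} for ora.*.vip resources.

    Assumes the layout shown in this note: a line with the resource name,
    then one row per instance number.
    """
    result = {}
    current = None
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ora."):
            current = stripped
            continue
        match = ROW.match(line)
        if match and current and current.endswith(".vip"):
            result[current] = (match.group(2), match.group(3), match.group(4).strip())
    return result


def vips_needing_manual_start(status_text):
    """List VIP resources whose STATE column is OFFLINE (the symptom in step 4)."""
    return [name for name, (_, state, _) in vip_states(status_text).items()
            if state == "OFFLINE"]
```

A VIP reported by `vips_needing_manual_start` is a candidate for the manual srvctl start described in step 5. A VIP that is ONLINE but INTERMEDIATE with FAILED OVER is still running on the other node and is not reported.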
<GRID_HOME>/log/<node>/agent/crsd/orarootagent_root/orarootagent_root.log shows:
Oracle Database 11g Clusterware Release 11.2.0.4.0 - Production Copyright 1996, 2011 Oracle. All rights reserved.
2014-06-09 21:42:56.882: [ora.net1.network][4507]{0:1:17379} [check] ioctl Error
2014-06-09 21:42:56.883: [ AGFW][2057]{0:1:17379} Agent sending last reply for: RESOURCE_PROBE[ora.net1.network <hostname> 1] ID 4097:790184
2014-06-09 21:42:56.885: [ AGFW][2057]{0:1:17379} Agent received the message: RESOURCE_PROBE[ora.net1.network <hostname> 1] ID 4097:790187
2014-06-09 21:42:56.885: [ AGFW][2057]{0:1:17379} Preparing CHECK command for: ora.net1.network <hostname> 1
2014-06-09 21:42:56.886: [ora.net1.network][1166]{0:1:17379} [check] ioctl Error
2014-06-09 21:42:56.887: [ora.net1.network][1166]{0:1:17379} [check] (null) category: -1, operation: failed system call, loc: ioctl, OS error: 6, other:
2014-06-09 21:42:56.887: [ora.net1.network][1166]{0:1:17379} [check] clsn_agent::check: Exception IoctlException
2014-06-09 21:42:56.887: [ora.net1.network][1166]{0:1:17379} [check] ioctl Error
......
When there is no VIP failback, ora.net1.network status on node 2 shows INTERMEDIATE instead of ONLINE.
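To confirm that a system matches this symptom, the orarootagent log can be scanned for failed ioctl checks on the network resource. The sketch below assumes log lines in the format quoted above. The function name `ioctl_check_failures` and the default resource name are illustrative assumptions.

```python
import re

# Captures the timestamp and the first bracketed component of a log line, e.g.
# "2014-06-09 21:42:56.882: [ora.net1.network][4507]{0:1:17379} [check] ioctl Error"
LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): \[\s*([^\]]+?)\s*\]"
)


def ioctl_check_failures(log_lines, resource="ora.net1.network"):
    """Return timestamps of [check] lines for `resource` that mention ioctl."""
    hits = []
    for line in log_lines:
        match = LINE.match(line)
        if not match or match.group(2) != resource:
            continue
        # Case-insensitive so that "IoctlException" lines are counted as well.
        if "[check]" in line and "ioctl" in line.lower():
            hits.append(match.group(1))
    return hits
```

In practice the lines would be read from the orarootagent_root.log file named above, for example with `open(path).readlines()`. Repeated hits after the cable has been reconnected are consistent with the behavior described in this note, but they are not proof of it.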
CAUSE
This is reported in BUG 19126172 - FAILBACK FROM NODE 1 TO NODE 2 NOT HAPPENING
Closed as unpublished BUG 17059927 - ORA.NET1.NETWORK KEEP INTERMEDIATE AFTER DOWN PUBLIC NIC
This is a code defect: when the ioctl call fails, the network resource check simply re-throws the error. As a result, the network resource stays in the INTERMEDIATE state, which prevents the VIP from failing back.
SOLUTION
Bug 17059927 is fixed in the 11.2.0.4.5 PSU, the Windows 11.2.0.4.12 Bundle, and 12.1.0.2.
Apply the PSU or bundle patch that includes the fix.
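The fix levels above can be turned into a quick check against an installed version string. This is a rough sketch only: the function name `fix_confirmed` is hypothetical, and treating releases later than 12.1.0.2 as containing the fix is an assumption that the note does not state. The authoritative check remains the list of bugs fixed by the installed patches.

```python
def parse_version(text):
    """Turn a dotted version such as '11.2.0.4.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def fix_confirmed(version, windows=False):
    """Return True if `version` is at or above a fix level named in this note.

    11.2.0.4: PSU level 5 or later (bundle level 12 or later on Windows).
    12.1.0.2 and later: assumed to include the fix (an assumption for
    anything beyond 12.1.0.2 itself).
    Other versions return False, meaning "not confirmed by this note".
    """
    parts = parse_version(version)
    if parts[:4] == (11, 2, 0, 4):
        level = parts[4] if len(parts) > 4 else 0
        return level >= (12 if windows else 5)
    return parts[:4] >= (12, 1, 0, 2)
```

For example, `fix_confirmed("11.2.0.4.4")` is False, while `fix_confirmed("11.2.0.4.5")` is True on non-Windows platforms.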