Diagnosing LNS wait on SENDREQ
chris_c, Jun 26 2012 (edited Jun 26 2012)
Database version: 11.2.0.2
Operating system: SUSE Linux 64-bit
Configuration: 2-node primary, 2-node physical standby, maximum availability mode.
Network: bonded 10 Gbit Ethernet with multiple 10 Gbit inter-site links, round-trip time 0.5 ms
We have an issue with Data Guard replication: 99.9% of the time the response is under 2 ms, but occasionally we see a long wait on "LNS wait on SENDREQ" in the 2-4 second range. Most of the time this is not an issue; however, there are a number of timing-sensitive parts of the application where a delay of 4 seconds is noticeable to our customers and could be embarrassing. I am currently trying to track down the exact cause of the problem and would be interested to hear from anyone who has experience diagnosing this kind of problem.
We are planning to run some tests over the next few weekends to determine the root cause. The current plan is to take a tcpdump on all four nodes, trace the application using extended SQL trace, and monitor the I/O throughput at the standby. Any suggestions on useful tools for diagnosing this kind of intermittent problem would be appreciated.
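To line the extended SQL trace up against the tcpdump captures, something like the rough Python sketch below could pull out only the waits that exceed a threshold. It assumes the WAIT lines look like `WAIT #<cursor>: nam='<event>' ela= <microseconds> ... tim=<timestamp>`, which is my understanding of the usual trace layout and should be checked against a real trace file. The function name `slow_waits`, the one-second threshold and the sample lines are all made up for illustration.

```python
import re

# Assumed layout of a WAIT line in an extended SQL trace file.
# Verify against a real trace before relying on this pattern.
WAIT_RE = re.compile(
    r"^WAIT #\d+: nam='(?P<event>[^']*)' ela= *(?P<ela>\d+) .*\btim=(?P<tim>\d+)"
)


def slow_waits(lines, threshold_us=1_000_000):
    """Return (tim, event, ela_us) for each wait lasting at least threshold_us."""
    hits = []
    for line in lines:
        match = WAIT_RE.match(line)
        if match is None:
            continue  # not a WAIT line, or not in the assumed layout
        ela_us = int(match.group("ela"))
        if ela_us >= threshold_us:
            hits.append((int(match.group("tim")), match.group("event"), ela_us))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real trace.
    sample = [
        "WAIT #5: nam='log file sync' ela= 1450 buffer#=812 sync scn=91 tim=1340700000000100",
        "WAIT #5: nam='log file sync' ela= 3200000 buffer#=813 sync scn=92 tim=1340700003200500",
        "EXEC #5:c=0,e=120,p=0,cr=0,cu=1,mis=0,r=1,dep=0,og=1,tim=1340700003200700",
    ]
    for tim, event, ela_us in slow_waits(sample):
        print(f"tim={tim} event={event} waited {ela_us / 1e6:.2f} s")
```

The `tim` value is passed through untouched, so it would still need mapping to wall-clock time before comparing it with the packet timestamps in the captures.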
Chris