Excessive waits; latch: library cache + latch: shared pool
I'm seeing something I've never seen before. We're running a 3-node 10.2.0.3 RAC cluster with ASM on RHEL4. Over the past month we've started having strange periodic spikes in wait events, and during these spikes pretty much everything in the database pauses and waits. The spikes typically last less than a minute, but this is unnerving from a stability standpoint. I can't correlate them directly to anything I've looked at (checkpointing, sga_resize, sequence caching). I have noticed that from about 10AM to 4PM our shared pool and buffer cache seem to trade 200M of SGA back and forth. This happens across all instances, and while the resizing is certainly going on at the times we see the concurrency spike (we use Grid Control), there are plenty of other times during the day when it happens and we don't see a spike.
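One way to line the resize activity up against the spike times is a query along these lines. It is only a minimal sketch, assuming the standard GV$SGA_RESIZE_OPS view is accessible to the account running it; the one-day window and the column aliases are arbitrary illustrative choices.

```sql
-- Sketch only: list recent SGA resize operations on every RAC instance
-- so their timestamps can be compared with the spike times shown in Grid Control.
-- The one-day window (SYSDATE - 1) is an assumption; widen or narrow as needed.
SELECT inst_id,
       component,
       oper_type,
       oper_mode,
       initial_size / 1024 / 1024 AS initial_mb,
       final_size   / 1024 / 1024 AS final_mb,
       status,
       TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI:SS') AS started,
       TO_CHAR(end_time,   'YYYY-MM-DD HH24:MI:SS') AS ended
FROM   gv$sga_resize_ops
WHERE  start_time > SYSDATE - 1
ORDER  BY start_time, inst_id;
```

Rows whose started/ended times fall inside a spike window, with the shared pool shrinking or growing, would be the ones worth a closer look.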
We've also noticed that at about the same time we started seeing these spikes, we hit a bug (confirmed by Oracle) that causes the LISTENER to crash. It apparently has something to do with excessive paging on the host, though I'm not sure of the details. I don't know if this is related, but thought I'd mention it. We figured that when we go in to apply the patch we'd take a little memory from our PGA and give it to the SGA (the PGA recommendations in the AWR reports say we should be OK), to perhaps alleviate the tug of war between the shared pool and the buffer cache. Adding more memory is not an immediate option.
Has anyone encountered anything like this? Does anyone have any ideas about what more I could check?
Thanks.