Clear cache operation blows up the cluster

754801, Dec 7 2011 (edited Jan 5 2012)

Hi,
We are running a 9-node Coherence 3.6 cluster. Every night a Java process starts up, issues a cache.clear() command on each cache, and then terminates. This appears to be harming the health of the cluster: right after this process joins and leaves, several nodes run out of memory and quit. See the error below.
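
For readers who want a concrete picture of the job described above, here is a minimal sketch of its shape. It is not the actual production code. Plain java.util.Map instances stand in for Coherence NamedCache objects (NamedCache is a Map, so clear() is called the same way), and the class name, method name, and cache names are all made up for illustration. A real job would obtain each cache from the cluster rather than build it locally.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class NightlyCacheClear {

    /**
     * Calls clear() on every cache in the given collection and returns
     * how many caches were cleared. The Map values are stand-ins for
     * Coherence NamedCache instances (an assumption for this sketch).
     */
    static int clearAll(Map<String, ? extends Map<?, ?>> caches) {
        int cleared = 0;
        for (Map<?, ?> cache : caches.values()) {
            cache.clear(); // one bulk clear per cache, as in the nightly job
            cleared++;
        }
        return cleared;
    }

    public static void main(String[] args) {
        // Hypothetical cache names and contents, for illustration only.
        Map<String, Map<String, String>> caches = new LinkedHashMap<>();
        Map<String, String> orders = new HashMap<>();
        orders.put("k1", "v1");
        Map<String, String> accounts = new HashMap<>();
        accounts.put("k2", "v2");
        caches.put("orders", orders);
        caches.put("accounts", accounts);

        int cleared = clearAll(caches);
        System.out.println("Cleared " + cleared + " caches");
        // The process exits here, just as the nightly job terminates itself.
    }
}
```

The point of the sketch is only that the process is short-lived: it joins, issues one bulk clear() per cache, and leaves immediately afterwards.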

2011-12-07 03:05:18.786/57714.576 Oracle Coherence GE 3.6.1.0 <D5> (thread=Cluster, member=1): Service guardian is 23184ms late, indicating that this JVM may be running slowly or experienced a long GC
2011-12-07 03:05:20.518/57716.308 Oracle Coherence GE 3.6.1.0 <Error> (thread=DistributedCache:OA-DistributedCache, member=1): Terminating PartitionedCache due to unhandled exception: java.lang.OutOfMemoryError
2011-12-07 03:05:20.518/57716.308 Oracle Coherence GE 3.6.1.0 <Error> (thread=DistributedCache:OA-DistributedCache, member=1):
java.lang.OutOfMemoryError: Java heap space
at com.tangosol.coherence.component.net.memberSet.ActualMemberSet.setMember(ActualMemberSet.CDB:11)
at com.tangosol.coherence.component.net.memberSet.ActualMemberSet.add(ActualMemberSet.CDB:6)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.getOwnershipMemberSet(PartitionedService.CDB:13)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.getOwnershipSenior(PartitionedService.CDB:10)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.checkDistribution(PartitionedService.CDB:71)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.onNotify(PartitionedService.CDB:15)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onNotify(PartitionedCache.CDB:3)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
at java.lang.Thread.run(Unknown Source)

This has a ripple effect on all members, and the cluster goes down after it happens. Can you please let me know if you have seen this before, or what might be causing it?

Sairam

Edited by: SKR on Dec 7, 2011 9:47 AM

Locked on Feb 2 2012

Added on Dec 7 2011

#coherence, #coherence-support
