Hi All,
I'll try to keep my question as simple as possible: I'm trying to use the service guardian to automatically detect an abandoned thread and kill the originating process. So I set up a test that ends up with an abandoned thread and configured the service guardian... but it didn't work :D
The first thing I did was set the guardian's timeout to something very short (6 seconds). With that setting, this is what I get:
-----
2010-03-26 10:03:29.389/0.859 Oracle Coherence GE 3.5.3/465 <Info> (thread=main, member=n/a): Loaded cache configuration from "file:/C:/workspaces/POC/InvocableTest/bin/coherence-cache-config.xml"
2010-03-26 10:03:29.952/1.422 Oracle Coherence GE 3.5.3/465 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a
2010-03-26 10:03:35.295/6.765 Oracle Coherence GE 3.5.3/465 <Error> (thread=Cluster, member=n/a): Attempting recovery (due to soft timeout) of Guard{Daemon=TcpRingListener}
2010-03-26 10:03:35.904/7.374 Oracle Coherence GE 3.5.3/465 <Error> (thread=Cluster, member=n/a): Terminating guarded execution (due to hard timeout) of Guard{Daemon=TcpRingListener}
Coherence <Error>: Halting JVM due to unrecoverable service failure
2010-03-26 10:03:36.904/8.374 Oracle Coherence GE 3.5.3/465 <Error> (thread=Termination Thread, member=n/a): Full Thread Dump
Thread
Cluster
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onWait(Grid.CDB:6)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Unknown Source)
.
.
.
-----
So that's before the thread gets abandoned... it seems to detect a deadlock while Coherence is starting up. But that's OK, at least I know the service guardian is working. The problem comes when I set the timeout to something more realistic, 35 seconds for instance: I run my test, the thread ends up in an abandoned state, and the service guardian does not see it :(
Any ideas?
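To make the soft timeout / hard timeout sequence in the log above concrete, here is a minimal sketch of a two-stage heartbeat watchdog. It is not Coherence's implementation: the class names, the `check` method, and the 90% soft-timeout threshold are all assumptions made up for illustration. It only shows the general idea that a guarded thread has to report in, and that silence is escalated in two steps.

```java
public class GuardianSketch {

    /** What the watchdog concludes about a guarded thread (hypothetical names). */
    public enum State { HEALTHY, SOFT_TIMEOUT, HARD_TIMEOUT }

    /** Tracks the last heartbeat of one guarded thread. Times are plain millisecond values. */
    public static final class Guard {
        private final long timeoutMillis;   // hard timeout
        private final long softMillis;      // soft timeout, a percentage of the hard timeout
        private long lastHeartbeat;

        public Guard(long timeoutMillis, int softPercent, long now) {
            this.timeoutMillis = timeoutMillis;
            // Assumption: recovery is attempted at softPercent of the full timeout.
            this.softMillis = timeoutMillis * softPercent / 100;
            this.lastHeartbeat = now;
        }

        /** Called by the guarded thread to say "I am still making progress". */
        public synchronized void heartbeat(long now) {
            lastHeartbeat = now;
        }

        /** Called by the watchdog; classifies how long the guarded thread has been silent. */
        public synchronized State check(long now) {
            long silent = now - lastHeartbeat;
            if (silent >= timeoutMillis) {
                return State.HARD_TIMEOUT;  // give up on the thread, apply the failure policy
            }
            if (silent >= softMillis) {
                return State.SOFT_TIMEOUT;  // try to recover, e.g. by interrupting the thread
            }
            return State.HEALTHY;
        }
    }

    public static void main(String[] args) {
        // Simulated clock: a 6000 ms timeout and a guarded thread that never sends a heartbeat.
        Guard guard = new Guard(6000, 90, 0);
        for (long now = 0; now <= 6000; now += 1500) {
            System.out.println(now + " ms: " + guard.check(now));
        }
    }
}
```

In this sketch a thread is only ever classified if something created a `Guard` for it and the watchdog calls `check`; whether that matches how the test thread is handled in the real setup is exactly what the question is about.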
My override config is:
-----
<coherence>
  <cluster-config>
    <service-guardian>
      <timeout-milliseconds system-property="tangosol.coherence.guard.timeout">6000</timeout-milliseconds>
      <service-failure-policy>exit-process</service-failure-policy>
    </service-guardian>
  </cluster-config>
  <logging-config>
    <severity-level system-property="tangosol.coherence.log.level">5</severity-level>
    <character-limit system-property="tangosol.coherence.log.limit">0</character-limit>
  </logging-config>
</coherence>
-----
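Since the timeout element carries a `system-property` attribute, the value can also be supplied through the `tangosol.coherence.guard.timeout` system property instead of editing the file. A minimal sketch for the 35-second run, assuming the property is set before Coherence starts (the class and method names here are made up for illustration):

```java
public class GuardTimeoutProperty {

    // Property name taken from the system-property attribute in the override config.
    public static final String GUARD_TIMEOUT_PROPERTY = "tangosol.coherence.guard.timeout";

    /** Sets the guardian timeout property. Assumption: this runs before Coherence is started. */
    public static void useTimeoutMillis(long millis) {
        System.setProperty(GUARD_TIMEOUT_PROPERTY, Long.toString(millis));
    }

    public static void main(String[] args) {
        useTimeoutMillis(35_000L);
        System.out.println(GUARD_TIMEOUT_PROPERTY + "="
                + System.getProperty(GUARD_TIMEOUT_PROPERTY));
        // Coherence would be started only after this point (not shown here).
    }
}
```

Passing `-Dtangosol.coherence.guard.timeout=35000` on the JVM command line should have the same effect.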
Thanks!
Fernando
Edited by: ZeoS on Mar 26, 2010 7:13 AM