Detecting (And Killing) Abandoned Threads

762901 Mar 26 2010, edited Mar 30 2010
Hi All,
I'll try to keep my question as simple as possible: I'm trying to use the service guardian to automatically detect an abandoned thread and kill the originating process. I set up a test that ends with an abandoned thread and configured the service guardian... but it didn't work :D
The first thing I did was set the guardian's timeout to something very short (6 seconds). With that setting, this is what I get:
-----
2010-03-26 10:03:29.389/0.859 Oracle Coherence GE 3.5.3/465 <Info> (thread=main, member=n/a): Loaded cache configuration from "file:/C:/workspaces/POC/InvocableTest/bin/coherence-cache-config.xml"
2010-03-26 10:03:29.952/1.422 Oracle Coherence GE 3.5.3/465 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a
2010-03-26 10:03:35.295/6.765 Oracle Coherence GE 3.5.3/465 <Error> (thread=Cluster, member=n/a): Attempting recovery (due to soft timeout) of Guard{Daemon=TcpRingListener}
2010-03-26 10:03:35.904/7.374 Oracle Coherence GE 3.5.3/465 <Error> (thread=Cluster, member=n/a): Terminating guarded execution (due to hard timeout) of Guard{Daemon=TcpRingListener}
Coherence <Error>: Halting JVM due to unrecoverable service failure
2010-03-26 10:03:36.904/8.374 Oracle Coherence GE 3.5.3/465 <Error> (thread=Termination Thread, member=n/a): Full Thread Dump

ThreadCluster
java.lang.Object.wait(Native Method)
com.tangosol.coherence.component.util.Daemon.onWait(Daemon.CDB:18)
com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onWait(Grid.CDB:6)
com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39)
java.lang.Thread.run(Unknown Source)
.
.
.
-----

So that's before the thread even gets abandoned... it seems the guardian detects a deadlock while Coherence is starting up. That's OK, though: at least I know the service guardian is working. The problem comes when I set the timeout to something more realistic, 35 seconds for instance. I run my test, the thread ends up in an abandoned state, and the service guardian does not see it :(
Any ideas?
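
To make the two log messages above concrete (the soft timeout fires at 10:03:35.295 and the hard timeout about 0.6 seconds later), here is a minimal plain-JDK sketch of that kind of escalation: interrupt the guarded thread first, and only report it as abandoned if it is still alive at the hard deadline. This is not Coherence's implementation and uses no Coherence API. `GuardSketch`, `Outcome`, `guard`, `interruptible` and `stubborn` are hypothetical names invented for the illustration, and the sketch only watches a thread that was explicitly handed to it.

```java
import java.util.concurrent.TimeUnit;

// Illustrative watchdog only; all names here are made up for this sketch.
final class GuardSketch {
    enum Outcome { HEALTHY, RECOVERED, ABANDONED }

    // Runs the task on its own thread and escalates: soft timeout -> interrupt,
    // hard timeout -> report the thread as abandoned.
    static Outcome guard(Runnable task, long softMillis, long hardMillis) {
        if (softMillis <= 0 || hardMillis <= softMillis) {
            throw new IllegalArgumentException("need 0 < softMillis < hardMillis");
        }
        Thread worker = new Thread(task, "guarded-worker");
        worker.setDaemon(true); // a stuck worker must not keep the JVM alive
        worker.start();
        try {
            worker.join(softMillis);
            if (!worker.isAlive()) {
                return Outcome.HEALTHY;
            }
            // Soft timeout: attempt recovery by interrupting the worker.
            worker.interrupt();
            worker.join(hardMillis - softMillis);
            if (!worker.isAlive()) {
                return Outcome.RECOVERED;
            }
            // Hard timeout: a real guardian would now apply its failure policy
            // (for example, exiting the process). The sketch only reports it.
            return Outcome.ABANDONED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("watchdog itself was interrupted", e);
        }
    }

    // A task that blocks but gives up as soon as it is interrupted.
    static Runnable interruptible() {
        return () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // cooperate: return immediately
            }
        };
    }

    // A task that swallows interrupts for the given duration,
    // simulating a thread that cannot be recovered.
    static Runnable stubborn(long millis) {
        return () -> {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
            while (System.nanoTime() < deadline) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException ignored) {
                    // ignore the recovery attempt and keep going
                }
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(guard(() -> { }, 200, 400));
        System.out.println(guard(interruptible(), 100, 2_000));
        System.out.println(guard(stubborn(3_000), 100, 300));
    }
}
```

The point of the sketch is that a watchdog like this can only time out threads it has been asked to guard. Whether the abandoned thread in my test falls into that category for the service guardian is exactly what I'm unsure about.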

My override config is:
-----
<coherence>

<cluster-config>

<service-guardian>
<timeout-milliseconds system-property="tangosol.coherence.guard.timeout">6000</timeout-milliseconds>
<service-failure-policy>exit-process</service-failure-policy>
</service-guardian>
</cluster-config>

<logging-config>
<severity-level system-property="tangosol.coherence.log.level">5</severity-level>
<character-limit system-property="tangosol.coherence.log.limit">0</character-limit>
</logging-config>
</coherence>

-----
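
Since the override above still shows 6000 while the failing run uses 35 seconds, a quick way to double-check which value is in effect is to resolve it the way the `system-property` attribute is meant to work: the named system property wins if it is set, otherwise the element text is used. The following is a rough plain-JDK sketch under that assumption. `GuardTimeoutCheck` and `effectiveTimeoutMillis` are hypothetical names, and the code only parses the XML text itself; it does not touch Coherence.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative helper only; not part of Coherence.
final class GuardTimeoutCheck {

    // Returns the guardian timeout that the override XML would yield, assuming
    // the system property named in the "system-property" attribute takes
    // precedence over the element's text content.
    static long effectiveTimeoutMillis(String overrideXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            overrideXml.getBytes(StandardCharsets.UTF_8)));
            Element el = (Element) doc
                    .getElementsByTagName("timeout-milliseconds").item(0);
            if (el == null) {
                throw new IllegalArgumentException("no <timeout-milliseconds> element");
            }
            String propName = el.getAttribute("system-property"); // "" if absent
            String fromProp = propName.isEmpty() ? null : System.getProperty(propName);
            String value = (fromProp != null) ? fromProp : el.getTextContent();
            return Long.parseLong(value.trim());
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse override XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<coherence><cluster-config><service-guardian>"
                + "<timeout-milliseconds system-property=\"tangosol.coherence.guard.timeout\">"
                + "6000</timeout-milliseconds>"
                + "</service-guardian></cluster-config></coherence>";
        // Run with -Dtangosol.coherence.guard.timeout=35000 to see the override.
        System.out.println(effectiveTimeoutMillis(xml));
    }
}
```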

Thanks!
Fernando

Edited by: ZeoS on Mar 26, 2010 7:13 AM