Long pauses during ParNew garbage collection, please help!
843829, Dec 7 2007 (edited Dec 10 2007)

Hi,
We are running a server application on a large machine (~120 CPUs, ~380 GB memory).
After 1 or 2 hours of running, we suddenly get exorbitant application pause times during garbage collection and massive CPU usage from the Java VM.
We are running Java 6 (64-bit) with a 6 GB heap.
Concurrent garbage collection is turned on using the parameters:
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:CMSInitiatingOccupancyFraction=80
-XX:+DisableExplicitGC
We turned on verbose garbage collection and are getting the following output:
1. Normal operation:
Application time: 217.4656792 seconds
3180.905: [GC 3180.906: [ParNew
Desired survivor size 20119552 bytes, new threshold 4 (max 4)
- age 1: 2843824 bytes, 2843824 total
- age 2: 2577128 bytes, 5420952 total
- age 3: 5742024 bytes, 11162976 total
- age 4: 625672 bytes, 11788648 total
: 329531K->15764K(353920K), 0.1484379 secs] 2435799K->2122105K(3392144K), 0.1492386 secs]
Total time for which application threads were stopped: 0.1886810 seconds
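To compare collections without reading the numbers by eye, the summary line that closes each ParNew entry can be parsed. The following is only a sketch in Python, separate from the application itself; the function name `parse_parnew` and the regular expression are made up for illustration and assume exactly the line layout shown above.

```python
import re

# Matches the summary line that closes a ParNew entry, e.g.
# ": 329531K->15764K(353920K), 0.1484379 secs] 2435799K->2122105K(3392144K), 0.1492386 secs]"
# Assumption: sizes are always printed in K and the layout is exactly as in this log.
PARNEW_LINE = re.compile(
    r":\s*(\d+)K->(\d+)K\((\d+)K\),\s*([\d.]+) secs\]\s*"
    r"(\d+)K->(\d+)K\((\d+)K\),\s*([\d.]+) secs\]"
)

def parse_parnew(line):
    """Return sizes (in KB) and durations (in seconds) from one summary line, or None."""
    m = PARNEW_LINE.search(line)
    if m is None:
        return None
    return {
        "young_before_kb": int(m.group(1)),
        "young_after_kb": int(m.group(2)),
        "young_capacity_kb": int(m.group(3)),
        "parnew_secs": float(m.group(4)),
        "heap_before_kb": int(m.group(5)),
        "heap_after_kb": int(m.group(6)),
        "heap_capacity_kb": int(m.group(7)),
        "total_secs": float(m.group(8)),
    }

normal = parse_parnew(
    ": 329531K->15764K(353920K), 0.1484379 secs] "
    "2435799K->2122105K(3392144K), 0.1492386 secs]"
)
print(normal["young_before_kb"], normal["young_after_kb"], normal["total_secs"])
# prints: 329531 15764 0.1492386
```

Lines that are not ParNew summary lines (for example the "Application time" lines) return `None`, so the function can be run over the whole log file.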
2. The Problem:
Application time: 2.8858445 seconds
5008.433: [GC 5008.434: [ParNew
Desired survivor size 20119552 bytes, new threshold 2 (max 4)
- age 1: 15837712 bytes, 15837712 total
- age 2: 12284416 bytes, 28122128 total
: 348338K->39296K(353920K), 138.5317715 secs] 2487779K->2192551K(3392144K), 138.5327383 secs]
Total time for which application threads were stopped: 138.5778558 seconds
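The "new threshold" value is 4 in the normal entry and 2 in this one. As far as I understand HotSpot, the threshold is the first age at which the running total in the age table exceeds the desired survivor size, capped at the maximum. The Python sketch below is my reading of that rule, not something the log states; the function name `tenuring_threshold` is hypothetical, and the byte counts are copied from the two entries above.

```python
def tenuring_threshold(age_bytes, desired_survivor_bytes, max_threshold):
    """Assumed rule: first age whose cumulative size exceeds the desired
    survivor size, capped at max_threshold.

    age_bytes[i] is the number of bytes held by objects of age i + 1.
    """
    total = 0
    for age, size in enumerate(age_bytes, start=1):
        total += size
        if total > desired_survivor_bytes:
            return min(age, max_threshold)
    # The cumulative total never exceeded the target: use the maximum.
    return max_threshold

DESIRED = 20119552  # "Desired survivor size" from the log, in bytes

# Entry at 3180.905 (normal operation)
print(tenuring_threshold([2843824, 2577128, 5742024, 625672], DESIRED, 4))  # prints 4

# Entry at 5008.433 (138 second pause)
print(tenuring_threshold([15837712, 12284416], DESIRED, 4))  # prints 2
```

Under this reading, the drop to 2 reflects that objects of age 1 and 2 together (28122128 bytes) no longer fit in the desired survivor size, whereas in normal operation all four ages together (11788648 bytes) did.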
...
Application time: 2.9764564 seconds
5149.957: [GC 5149.957: [ParNew
Desired survivor size 20119552 bytes, new threshold 2 (max 4)
- age 1: 9483176 bytes, 9483176 total
- age 2: 14499344 bytes, 23982520 total
: 353920K->39296K(353920K), 231.5110574 secs] 2507175K->2204546K(3392144K), 231.5121011 secs]
Total time for which application threads were stopped: 231.5257754 seconds
...
Application time: 2.7932907 seconds
5384.277: [GC 5384.278: [ParNew
Desired survivor size 20119552 bytes, new threshold 4 (max 4)
- age 1: 10756376 bytes, 10756376 total
- age 2: 9135888 bytes, 19892264 total
: 353920K->28449K(353920K), 256.2065591 secs] 2519170K->2207651K(3392144K), 256.2076388 secs]
Total time for which application threads were stopped: 256.2221463 seconds
I can't find any significant differences in the log between the fast and the long-running garbage collections.
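One way to look for differences is to derive two figures per collection from the numbers already in the log: an estimate of how much was promoted to the old generation, and the share of wall-clock time the application spent stopped. The Python sketch below does this arithmetic; the helper names are made up, and the promotion estimate assumes that the old generation was not otherwise changed during the pause.

```python
# Values copied from the four log entries above:
# (timestamp, young_before_kb, young_after_kb, heap_before_kb, heap_after_kb,
#  stopped_secs, preceding_application_secs)
collections = [
    ("3180.905", 329531, 15764, 2435799, 2122105, 0.1886810, 217.4656792),
    ("5008.433", 348338, 39296, 2487779, 2192551, 138.5778558, 2.8858445),
    ("5149.957", 353920, 39296, 2507175, 2204546, 231.5257754, 2.9764564),
    ("5384.277", 353920, 28449, 2519170, 2207651, 256.2221463, 2.7932907),
]

def promoted_kb(young_before, young_after, heap_before, heap_after):
    """Estimate of KB that left the young generation but stayed in the heap.

    Assumption: nothing else changed old generation occupancy during the pause.
    """
    return (young_before - young_after) - (heap_before - heap_after)

def stopped_share(stopped_secs, app_secs):
    """Fraction of the interval (application run + pause) spent stopped."""
    return stopped_secs / (stopped_secs + app_secs)

for ts, yb, ya, hb, ha, stopped, app in collections:
    print(ts, promoted_kb(yb, ya, hb, ha), "KB promoted,",
          format(stopped_share(stopped, app), ".2%"), "stopped")
```

With the numbers above, the estimate comes out at 73 KB for the fast collection and 13814 KB, 11995 KB and 13952 KB for the three slow ones, and the stopped share goes from under 0.1 percent to roughly 98 to 99 percent. Whether the larger promotion volume is related to the pauses is not something the log alone can confirm.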
I urgently need help in solving this problem!
What can I do?