I am working on a stateless portal and have a throughput problem. Could someone please give me some ideas? Thanks in advance.
The application uses the java.util.concurrent package, and I have found that it suffers from premature promotion under the normal GC settings. jstat shows that some objects are promoted from Eden to the survivor spaces and then to the old generation very quickly, and jmap shows that they are simply the Node objects used by LinkedBlockingQueue. As a result, Full GCs happen quite often, which is not acceptable in a busy portal environment.
I tried many GC tuning options, but none of them stops the premature promotion except setting NeverTenure. Its side effect is that minor GCs become too long, because all objects stay in the new generation and every minor GC takes a long time to scan through them. I then designed a set of parameters to slow down the premature promotion. In the QA environment it works very well; in the production environment, however, every minor GC takes 100 ms, which is too long for a real-time environment.
Here are the GC parameters before the change: -XX:+UseParallelGC -XX:ParallelGCThreads=6, and memory is 1024m.
With these settings a minor GC takes only 6 ms, but Full GCs occur.
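To compare minor and full collections from inside the application, the standard GarbageCollectorMXBean interface can be polled. The following is only a minimal sketch: the class name GcStats and the output format are illustrative and not part of the portal. With the parallel collector on HotSpot the beans are typically named "PS Scavenge" (young) and "PS MarkSweep" (old), but that should be checked on the JVM in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the application described above.
public class GcStats {

    /** Returns one line per collector: name, collection count, total and average time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();  // -1 if the collector does not report it
            long totalMs = gc.getCollectionTime(); // accumulated, approximate elapsed time
            double avgMs = count > 0 ? (double) totalMs / count : 0.0;
            lines.add(String.format("%s count=%d totalMs=%d avgMs=%.1f",
                    gc.getName(), count, totalMs, avgMs));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Calling snapshot() periodically and diffing the counts gives the number of young and old collections per interval without attaching jstat.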
Here are the new GC settings: -XX:PermSize=40m -XX:-UseAdaptiveSizePolicy -XX:+UseParallelOldGC -XX:MaxGCPauseMillis=2 -XX:MaxGCMinorPauseMillis=1 -XX:SurvivorRatio=10 -XX:NewSize=850m, and memory is 2100m.
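As a rough check of what these flags imply for the young generation, the layout can be worked out by hand. The sketch below assumes the usual reading of SurvivorRatio (Eden is SurvivorRatio times the size of one survivor space, and the young generation is Eden plus two survivor spaces); the class and method names are made up for illustration.

```java
// Hypothetical helper: derives the Eden and survivor sizes from NewSize and SurvivorRatio.
public class YoungGenLayout {

    // young = eden + 2 * survivor and eden = ratio * survivor,
    // so one survivor space is young / (ratio + 2).
    static double survivorMb(double newSizeMb, int survivorRatio) {
        return newSizeMb / (survivorRatio + 2);
    }

    static double edenMb(double newSizeMb, int survivorRatio) {
        return survivorRatio * survivorMb(newSizeMb, survivorRatio);
    }

    public static void main(String[] args) {
        double newSizeMb = 850.0; // from -XX:NewSize=850m
        int survivorRatio = 10;   // from -XX:SurvivorRatio=10
        System.out.printf("survivor=%.1f MB each, eden=%.1f MB%n",
                survivorMb(newSizeMb, survivorRatio),
                edenMb(newSizeMb, survivorRatio));
    }
}
```

Under that assumption each survivor space is about 70.8 MB and Eden about 708.3 MB, so only roughly 8% of the new generation is available to hold objects that survive one collection.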
The reason for turning off the adaptive size policy is that otherwise the JVM resizes the new generation and promotion gets out of control. But now a minor GC happens every 9 to 15 seconds and lasts 100 ms. In the QA environment, it takes 6 ms every 3 seconds. As stated, the application is stateless and any new object should be collectable within 1 second, so I do not know why the concurrent objects survive so long or why the minor GC takes so long. Here is some output from jstat:
S0      S1      E       YGCT
0       20.74   99.36   20.281
20.56   0.0     1.91    20.423
The capacities are: S0=S1=72.5M, E=725M, O=1280M.
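The two jstat samples can be turned into absolute numbers with a little arithmetic. The sketch below assumes that exactly one minor GC ran between the two samples (the YGC count column is not shown, so this is a guess based on Eden dropping from 99.36% to 1.91% and the survivor spaces swapping); the class and method names are illustrative.

```java
// Hypothetical helper: converts the jstat percentages and the YGCT delta into MB and seconds.
public class JstatDelta {

    /** YGCT is cumulative, so the difference is the young GC time between two samples. */
    static double ygctDeltaSeconds(double before, double after) {
        return after - before;
    }

    /** Converts a jstat utilisation percentage into an absolute size. */
    static double usedMb(double percent, double capacityMb) {
        return percent / 100.0 * capacityMb;
    }

    public static void main(String[] args) {
        double pause = ygctDeltaSeconds(20.281, 20.423); // seconds spent in young GC
        double edenBefore = usedMb(99.36, 725.0);        // Eden just before the collection
        double survivorAfter = usedMb(20.56, 72.5);      // live data copied into S0
        System.out.printf("young GC time between samples: %.3f s%n", pause);
        System.out.printf("eden before GC: %.1f M, survivor after GC: %.1f M%n",
                edenBefore, survivorAfter);
    }
}
```

With the figures above this gives about 0.142 s of young GC time between the samples, and roughly 14.9M left in the survivor space out of about 720M that was in Eden.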
Any advice is highly appreciated!