
Huge memory pages and the off-heap cache

Greybird-OracleAug 31 2015 — edited Aug 31 2015

Hi,

This is the first of a couple of follow-up posts to the recent JE 6.4.9 announcement. If you try out the new off-heap cache (it is optional), you will notice changes in behavior, and I'd like to share my experiences from large-scale performance tests.

When objects are evicted from the JE main (on-heap) cache, they are added to the off-heap cache. This means that eviction from the main cache never logs dirty objects, and is therefore very fast. With a large off-heap cache, the rate of object copying between the two caches can be high, and this can change CPU utilization, GC behavior, thread scheduling, etc. The performance benefit -- reduced I/O -- far outweighs any negative impacts, at least in my tests, but the behavioral changes may require new performance tuning for your application.

This first post is about huge pages. If you have other questions about the off-heap cache or other changes in 6.4.9, please start a thread and we'll be happy to help you.

--mark

Long GC Pauses due to Transparent Huge Pages

During performance testing of JE with an off-heap cache I ran into a performance problem that appears to be related to the use of Transparent Huge Pages (THP). I noticed long young generation GC pauses where the system CPU time during GC was higher than the user CPU time.

To check for this symptom, look in the GC log for long pauses corresponding to the performance drop. The following line will show system time vs user time.

[Times: user=XXX sys=YYY, real=ZZZ secs]

For long pauses, if you see sys times much higher than the user times, this may be the THP problem. (Note that you may also see high sys times at the beginning of the run if you don't specify the -XX:+AlwaysPreTouch JVM option, and this is simply due to the initial page allocations. We always specify the -XX:+AlwaysPreTouch option in our tests.)
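The check described above can be scripted. Here is a minimal Python sketch that scans GC log lines for the [Times: ...] suffix and reports long pauses where sys time exceeds user time. The function name, the 1-second threshold, and the ratio parameter are illustrative assumptions, not anything from JE or the JVM; adjust them for your own logs.

```python
import re

# Matches the timing suffix shown above, e.g.
# [Times: user=0.25 sys=0.01, real=0.05 secs]
TIMES_RE = re.compile(
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), "
    r"real=(?P<real>[\d.]+) secs\]"
)


def suspicious_pauses(lines, min_real_secs=1.0, sys_ratio=1.0):
    """Return (line_number, user, sys, real) for long pauses dominated by sys time.

    min_real_secs and sys_ratio are hypothetical tuning knobs: a pause is
    reported when real >= min_real_secs and sys > user * sys_ratio.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = TIMES_RE.search(line)
        if match is None:
            continue  # not a line with GC timing information
        user = float(match.group("user"))
        sys_time = float(match.group("sys"))
        real = float(match.group("real"))
        if real >= min_real_secs and sys_time > user * sys_ratio:
            hits.append((number, user, sys_time, real))
    return hits
```

To use it, pass the lines of your GC log, for example suspicious_pauses(open("gc.log")), where gc.log is a placeholder file name. Pauses early in the run may show up too if -XX:+AlwaysPreTouch is not specified, as noted above.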

Disabling THP solved the problem for me. THP is disabled by setting /sys/kernel/mm/transparent_hugepage/enabled to 'never'. THP is often enabled on Linux by default. For a production system, be sure it is disabled at boot time.

See:

http://unix.stackexchange.com/questions/99154/disable-transparent-hugepages

There are warnings in many places on the web that THP can cause performance problems, and many recommendations that it be disabled. For example:

http://docs.mongodb.org/manual/tutorial/transparent-huge-pages/

It may also be sufficient to disable only THP defrag by setting /sys/kernel/mm/transparent_hugepage/defrag to 'never', since defrag is thought to be the cause of the problem. That way, THP could still be used for apps that benefit from it, although it will of course be somewhat less useful without defrag running. Here is a reference that recommends disabling defrag only:

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_admin_performance.html#xd_583c10bfdbd326ba-7dae4aa6-147c30d0933--7fd5__section_hw3_sdf_jq
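To confirm which THP settings are actually in effect, you can read the two files mentioned above. The kernel lists all possible values and puts the active one in square brackets (for example "always madvise [never]"). The following is a small Python sketch of that check; the function names are my own, and it only reads the settings, it does not change them.

```python
from pathlib import Path

# Directory holding the THP settings discussed above.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_setting(text):
    """Return the bracketed (active) value from e.g. 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed value found


def thp_status(base=THP_DIR):
    """Read the active 'enabled' and 'defrag' values; None if a file is unreadable."""
    status = {}
    for name in ("enabled", "defrag"):
        try:
            status[name] = active_setting((base / name).read_text())
        except OSError:
            # File is missing (no THP support) or not readable.
            status[name] = None
    return status
```

Calling thp_status() on a host where THP has been disabled should report 'never' for "enabled". If only defrag was disabled, "defrag" should be 'never' while "enabled" keeps its previous value.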

Note that the THP problem I encountered may not occur with smaller memory sizes or lower-throughput workloads than in our tests. We used a 31GiB Java heap and a 200GiB off-heap cache, and operation throughput was in the 40k to 60k ops/sec range.

By the way, I also found that THP is not used for off-heap cache allocations (Unsafe.allocateMemory), no matter what JVM options were specified.

Performance Improvement with a Huge Page Pool

Configuring a Huge Page Pool (HPP), on the other hand, was beneficial to JE performance in my tests. I saw a 10% throughput increase in some tests.

Two configuration steps are necessary: configure the size of the huge page pool on Linux, and pass -XX:+UseLargePages to the JVM. The JVM -XX:+UseHugeTLBFS option should also work, but I didn't try it. See:

In my test I configured the size of the huge page pool to be a little larger than the JVM heap size I was using (31GiB). My page size was 2048KiB. To find your page size, run "cat /proc/meminfo | grep Huge" as described in the references above. I configured a pool size of 17,000 pages, which is 33GiB. A smaller number of huge pages (16,500) could have been used to solve the problem; I used a pool size larger than necessary just to be sure that huge pages were always used.
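The pool-size arithmetic above is easy to get wrong, so here is a short Python sketch of it. It parses the Hugepagesize line from /proc/meminfo output and converts between heap size and page count. The function names are illustrative, and the sketch covers the heap only; it says nothing about any other huge page use by the JVM.

```python
import math
import re

KIB_PER_GIB = 1024 * 1024


def huge_page_size_kib(meminfo_text):
    """Extract the huge page size in KiB from /proc/meminfo contents, or None."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def pages_for_heap(heap_gib, page_kib):
    """Smallest number of huge pages that covers a heap of heap_gib GiB."""
    return math.ceil(heap_gib * KIB_PER_GIB / page_kib)


def pool_size_gib(pages, page_kib):
    """Total size in GiB of a pool with the given number of huge pages."""
    return pages * page_kib / KIB_PER_GIB
```

With the numbers from my test, pages_for_heap(31, 2048) gives 15,872 pages as the bare minimum for the heap itself, and pool_size_gib(17000, 2048) gives roughly 33.2GiB, so both 16,500 and 17,000 leave some headroom. On most Linux systems the pool size is set through the vm.nr_hugepages kernel setting, but check the documentation for your distribution.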

Note that the JE off-heap cache (which uses Unsafe.allocateMemory) does not use the memory in the huge page pool. Therefore, when using an off-heap cache you should try not to allocate a huge page pool larger than the JVM heap size. The same thing is true if you have other processes running that need large amounts of memory and cannot use the memory in the huge page pool.

If you do not allocate a huge page pool, and you pass -XX:+UseLargePages (or -XX:+UseHugeTLBFS) to the JVM, you'll get a warning message such as the following:

Java HotSpot(TM) 64-Bit Server VM warning: Failed to reserve shared memory (errno = 1).

The drawback of using an HPP is the extra configuration needed, especially since the size of the pool must be configured to fit the size of the Java heap.


Added on Aug 31 2015
