Hello all,
I have a 3-node production RAC system running 4 databases on 11.2.0.4.5, with 96 GB of RAM per node. Recently we had multiple node evictions and restarts.
After analysis, and after running the ORAchk utility, I added two kernel parameters, and the node evictions have stopped:
vm.min_free_kbytes=524288
vm.swappiness=100
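For reference, I applied them roughly like this (just a sketch; it assumes the settings belong in /etc/sysctl.conf rather than a drop-in file under /etc/sysctl.d):

# append to /etc/sysctl.conf, then reload without rebooting
echo "vm.min_free_kbytes=524288" >> /etc/sysctl.conf
echo "vm.swappiness=100" >> /etc/sysctl.conf
sysctl -p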
Even before adding these kernel parameters, we noticed high memory consumption by the Oracle processes: they fill all available memory on the server to the point that the cluster can no longer communicate, which eventually causes the server to freeze completely.
I have done some analysis, and again from the ORAchk report I found the following findings to be highly relevant to the current situation:
Hugepages are not being used by database
AND
PGA allocation for all databases is more than total memory available on this system
Regarding the HugePages issue: each node has 96 GB of RAM, and the shared memory and HugePages settings are as follows:
kernel.shmmni = 4096
kernel.shmmax = 50682953728
kernel.shmall = 25165824
vm.nr_hugepages = 23067
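My rough reading of those numbers (sketch arithmetic, please correct me if I'm off):

kernel.shmmax  = 50682953728 bytes            ~= 47 GB  (largest single shared memory segment, i.e. largest SGA)
kernel.shmall  = 25165824 pages x 4 KB/page    = 96 GB  (total shared memory allowed, the whole of RAM)
vm.nr_hugepages = 23067 pages x 2 MB/page     ~= 45 GB  (memory reserved as huge pages)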
Total SGA and PGA for all databases is as follows:

SGA Total for all DBs | PGA Total for all DBs
----------------------|----------------------
                      |
This is the output of 'grep Huge /proc/meminfo' on one of the nodes:
HugePages_Total: 23067
HugePages_Free: 1063
HugePages_Rsvd: 1041
Hugepagesize: 2048 kB
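In case it helps, this is how I am trying to recompute the required page count. It is only a sketch of the approach I've seen in Oracle's hugepages_settings.sh script (sum the shared memory segments of the running instances and divide by the huge page size); the way I parse the ipcs output here is my own assumption:

# run while all instances on the node are up
HPG_SZ=$(grep Hugepagesize /proc/meminfo | awk '{print $2}')   # huge page size in kB
NUM_PG=0
for SEG_BYTES in $(ipcs -m | awk '$5 ~ /^[0-9]+$/ {print $5}'); do
    # pages needed for this shared memory segment, plus one spare page
    PG=$(( SEG_BYTES / (HPG_SZ * 1024) ))
    [ "$PG" -gt 0 ] && NUM_PG=$(( NUM_PG + PG + 1 ))
done
echo "Suggested vm.nr_hugepages: $NUM_PG"

I also understand that on 11.2 the alert log prints a "Large Pages Information" section at instance startup showing whether the SGA actually went into huge pages; is that the right place to verify this?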
My concerns are:
1. Why are huge pages not being used? Should the PGA and SGA figures above be taken into account when setting the HugePages count?
2. Why are the Oracle processes consuming all server resources when I made sure they don't cross 10% of the system memory? And how do I correct this if my calculations are wrong?
3. Why does the ORAchk tool multiply PGA_AGGREGATE_TARGET by 3? And what would be the right calculation for PGA if it really is overestimated relative to the server's available resources? (See the query sketch just below this list for how I'm currently checking actual PGA usage.)
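For question 3, this is roughly how I've been comparing the configured target with actual PGA usage on each instance (again just a sketch, run as SYSDBA; the v$pgastat statistic names are from memory):

sqlplus -s / as sysdba <<'EOF'
-- compare the PGA target with what the instance has actually allocated
SELECT name, ROUND(value/1024/1024) AS mb
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'maximum PGA allocated');
EOF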
Many thanks in advance for your help with this, as it is causing multiple node outages and restarts...