Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Increase in avg log file sync wait while log file parallel write wait remained normal

Vsevolod AfanassievJan 20 2020 — edited Jan 22 2020

Oracle 12.1 on AIX, single instance

This is large (30 TB) and busy database running mix of OLPT and batch. Commit rate is low (60 per second), avg wait on log file sync is 20 - 30 ms. Normally waits on LFS are less than 5% of total waits. Today for a brief period LFS waits became very slow:

- Statspack repor for 20 min interval shows avg wait on LFS = 303 ms.  No other abnormal waits during this interval

- Waits on log file parallel write remained normal at 11 ms

- Other metrics were normal: transactions per second = 60, redo per transaction = 50K

- No significant changes to the logical reads (1 million per sec), physical reads (100K per sec), etc

Looking at active session history I can see that normally we get less than 5 sessions waiting on LFS. Today during 1 min  interval the number of sessions waiting on LFS exceeded 600. It happened out of the blue, there were no ORA-00600 or ORA-07445. At the LPAR level CPU and memory utilization remained normal. There is a trace file for log writer in /trace, but it hasn't been updated for 20 hours.

Histogram distribution:

                           Total ----------------- % of Waits ------------------

Event                      Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s

-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----

log file sync                82K    .3   1.0   4.0  10.2  22.4  35.6  25.2   1.3

This database has two underscore parameters set to non-default values:

_use_adaptive_log_file_sync = FALSE

-use_single_log_writer = TRUE

I had a look in ASH trying to determine the event LGWR was waiting for: it didn't get populated for the interval

when LFS waits were slow.

Looking at v$session_event for LGWR session: seems healthy

EVENT                                    TOTAL_WAITS TIME_WAITED_MICRO

---------------------------------------- ----------- -----------------

direct path write                                  3             12969

control file single write                       2332          11430662

Disk file operations I/O                        3337          20695575

latch: redo writing                            28868          27397622

log file sequential read                       55915         198817936

log file single write                          55915         199806989

control file parallel write                  4083621        1.7887E+10

events in waitclass Other                   49319169        1.8559E+10

control file sequential read                13262044        3.1336E+10

rdbms ipc message                          773063332        3.5595E+12

log file parallel write                   1002356527        4.2436E+12

What could have possible caused it? what should we do if it happens again?

Comments

Processing

Post Details

Added on Jan 20 2020
9 comments
700 views