We have 9.2.0.8 database that experiences long waits on log file sync (average wait = 46 ms) while waits on log file parallel write are fine (average wait is less than 1 millisecond).
The application is middleware-type, it connects several other applications. A single user action in one application causes multiple requests to be sent back and forth through this middleware, so it needs db response time in milliseconds.
The database is quite simple:
- it has a few config tables that the application reads but rarely updates
- it has TRANSACTION_HISTORY table: the application inserts records into this table through single-row insert (approx 100 rows per second); each insert is followed by commit.
The records are kept for several months and then purged. The table has only NUMBER/VARCHAR2/DATE columns, no LOBS, LONG, etc. The table has 4 single-column non-unique indexes.
Average row length = 100 bytes.
The load profile doesn't show anything unusual, key numbers: 110 transactions per second, average transaction size = 1.5KB.
The data below is for 1 hour interval (purge wasn't running during this interval), the rates of physical reads and physical writes are low:
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 160,164.75 1,448.42
Logical reads: 57,675.25 521.58
Block changes: 934.90 8.45
Physical reads: 76.27 0.69
Physical writes: 86.10 0.78
User calls: 491.69 4.45
Parses: 321.24 2.91
Hard parses: 0.09 0.00
Sorts: 126.96 1.15
Logons: 0.06 0.00
Executes: 1,956.91 17.70
Transactions: 110.58
Top 5 events are dominated by log file sync:
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
log file sync 401,608 18,448 59.94
db file parallel write 124,044 3,404 11.06
CPU time 3,097 10.06
enqueue 10,476 2,916 9.48
db file sequential read 261,947 2,435 7.91
Events section:
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file sync 401,608 0 18,448 46 1.0
db file parallel write 124,044 0 3,404 27 0.3
enqueue 10,476 277 2,916 278 0.0
db file sequential read 261,947 0 2,435 9 0.7
buffer busy waits 11,467 67 173 15 0.0
SQL*Net more data to client 1,565,619 0 79 0 3.9
row cache lock 2,800 0 52 18 0.0
control file parallel write 1,294 0 45 35 0.0
log file switch completion 261 0 36 138 0.0
latch free 2,087 1,446 24 12 0.0
PL/SQL lock timer 1 1 20 19531 0.0
log file parallel write 143,739 0 17 0 0.4
db file scattered read 1,644 0 17 10 0.0
log file sequential read 636 0 8 13 0.0
Log buffer is approx 1.3 MB. We could increase the log buffer but there are no log buffer space waits, so I doubt that it will help.
Redo logs have their own filesystems, not shared with datafiles. This explains the difference between avg wait on log file parallel write (less than 1 ms) and db file parallel write (27 ms).
Redo logs are 100 MB, there are approx 120 log switches per day.
What has changed: the rate of inserts/commits has grown. Several months ago there were 25 inserts/commits per second into TRANSACTION_HISTORY table, now we get 110 inserts/commits per second.
What problem it causes to the application: due to slow response from the database the application (Java-based) requires more and more threads.
MOS documents on log file sync (for example, 1376916.1 Troubleshooting 'log file sync' waits) recommends to compare average wait on log file sync and log file parallel write.
If the values are close (e.g. log file sync = 20 ms and log file parallel write= 10 ms) then the waits are caused by slow I/O. However it isn't the case here.