10.2.0.2 aix 5.3 64bit archivelog mode.
I'm going to attempt to describe the system first and then outline the issue: The database is about 1Gb in size of which only about 400Mb is application data. There is only one table in the schema that is very active with all transactions inserting and or updating a row to log the user activity. The rest of the tables are used primarily for reads by the users and periodically updated by the application administrator with application code. There's about 1.2G of archive logs generated per day, from 3 50Mb redo logs all on the same filesystem.
The problem: We randomly have issues with users being kicked out of the application or hung up for a period of time. This application is used at a remote site and many times we can attribute the users issues to network delays or problems with a terminal server they are logging into. Today however they called and I noticed an abnormally high amount of 'log file sync' waits.
I asked the application admin if there could have been more activity during that time frame and more frequent commits than normal, but he says there was not. My next thought was that there might be an issue with the IO sub-system that the logs are on. So I went to our aix admin to find out the activity of that file system during that time frame. She had an nmon report generated that shows the RAID-1 disk group peak activity during that time was only 10%.
Now I took two awr reports and compared some of the metrics to see if indeed there was the same amount of activity, and it does look like the load was the same. With the same amount of activity & commits during both time periods wouldn't that lead to it being time spent waiting on writes to the disk that the redo logs are on? If so, why wouldn't the nmon report show a higher percentage of disk activity?
I can provide more values from the awr reports if needed.
per sec per trx
Redo size: 31,226.81 2,334.25
Logical reads: 646.11 48.30
Block changes: 190.80 14.26
Physical reads: 0.65 0.05
Physical writes: 3.19 0.24
User calls: 69.61 5.20
Parses: 34.34 2.57
Hard parses: 19.45 1.45
Sorts: 14.36 1.07
Logons: 0.01 0.00
Executes: 36.49 2.73
Transactions: 13.38
Redo size: 33,639.71 2,347.93
Logical reads: 697.58 48.69
Block changes: 215.83 15.06
Physical reads: 0.86 0.06
Physical writes: 3.26 0.23
User calls: 71.06 4.96
Parses: 36.78 2.57
Hard parses: 21.03 1.47
Sorts: 15.85 1.11
Logons: 0.01 0.00
Executes: 39.53 2.76
Transactions: 14.33
Total Per sec Per Trx
redo blocks written 252,046 70.52 5.27
redo buffer allocation retries 7 0.00 0.00
redo entries 167,349 46.82 3.50
redo log space requests 7 0.00 0.00
redo log space wait time 49 0.01 0.00
redo ordering marks 2,765 0.77 0.06
redo size 111,612,156 31,226.81 2,334.25
redo subscn max counts 5,443 1.52 0.11
redo synch time 47,910 13.40 1.00
redo synch writes 64,433 18.03 1.35
redo wastage 13,535,756 3,787.03 283.09
redo write time 27,642 7.73 0.58
redo writer latching time 2 0.00 0.00
redo writes 48,507 13.57 1.01
user commits 47,815 13.38 1.00
user rollbacks 0 0.00 0.00
redo blocks written 273,363 76.17 5.32
redo buffer allocation retries 6 0.00 0.00
redo entries 179,992 50.15 3.50
redo log space requests 6 0.00 0.00
redo log space wait time 18 0.01 0.00
redo ordering marks 2,997 0.84 0.06
redo size 120,725,932 33,639.71 2,347.93
redo subscn max counts 5,816 1.62 0.11
redo synch time 12,977 3.62 0.25
redo synch writes 66,985 18.67 1.30
redo wastage 14,665,132 4,086.37 285.21
redo write time 11,358 3.16 0.22
redo writer latching time 6 0.00 0.00
redo writes 52,521 14.63 1.02
user commits 51,418 14.33 1.00
user rollbacks 0 0.00 0.00
Edited by: PktAces on Oct 1, 2008 1:45 PM