We recently migrated an Oracle database from one VM host to another. This was done via Data Guard Physical standby so the OS and Database stayed the same – RedHat7/Oracle19. This keeps things apples-to-apples with the exception of the underlying VM Host and that the new VM has 40 CPUs versus 6 on the original.
After migrating we’ve noticed performance degradation that we are trying to narrow down.
I examined AWR reports before and after using same day of the week and same time range to get as close a match as possible in workload.
There are two things that stand out significantly:
#1 Read IO Requests changed from 148 per second to 22.4 per second.
Top ADDM Findings by Average Active Sessions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
Read IO requests: (pre-mig) 148.0 18.7
Read IO requests: (post-mig) 22.4 2.6
#2 DB CPU total Wait Time changed from 28.8K to 122.6K seconds.
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Total Wait Avg % DB Wait
Event Waits Time (sec) Wait time Class
------------------------------ ----------- ---------- --------- ------ --------
DB CPU (pre-mig) 28.8K 75.5
DB CPU (post-mig) 122.6K 99.3
I’d like to be sure I correctly understand what I’m seeing.
On #1 - We are confident that I/O is faster on the new host so I’m a bit confused why requests per second would have changed in this way unless due to faster I/O we are able to make more requests. Is this saying we are making more requests than we used to or is it showing we are getting more done?
On #2 – The CPU on both VM Hosts are the same Haswell architecture. The difference is the old VM had 6 and the new has 40. This AWR data indicates wait time for CPU requests has increased which would indicate a CPU issue; however, (from reading articles) I’m unsure if this actually is the result of having 40 CPUs instead of 6 and so amount of time waiting with no activity is simply higher (40/6=6.67, thus 28.8K*6.67=192K but that is clearly much more than 122.6K.) Is “DB CPU” time spent waiting on CPU requests from the database or is it CPU time spent on work + idle?