Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Long-running SQL suddenly starts spending all its time reading UNDO - why?

mtefftMay 11 2015 — edited May 11 2015

We have a long-running batch SQL (running about 14 hours so far) that joins 16 tables to do a direct-path insert into another table. There is only one occurrence of this SQL running in the instance at any given time.

Here are some statistics from DBA_HIST_SQLSTAT:

BEGIN_INTERVAL_TIME END_INTERVAL_TIME EXECUTIONS   ELAPSED_SEC   CPU_SECIOWAIT_SEC ROWS_PROCESSEDBUFFER_GETS PHYSICAL_READS

------------------- ------------------- ------------ ------------- ------------- ------------- -------------- -------------- --------------

05/11/2015 00:00:28 05/11/2015 01:00:30        0    519.92    396.03    154.04          5,152,419     22,998
05/11/2015 01:00:30 05/11/2015 02:00:32        3,507.69  2,818.16    754.26          0 34,834,637     87,823
05/11/2015 02:00:32 05/11/2015 03:00:33        3,521.94  2,907.21    691.31          0 36,936,025     84,857
05/11/2015 03:00:33 05/11/2015 04:00:36        3,522.74  2,968.57    631.35          0 40,630,840     83,614
05/11/2015 04:00:36 05/11/2015 05:00:38        3,504.09  3,060.45    522.26          0 37,509,360     67,110
05/11/2015 05:00:38 05/11/2015 06:00:40        3,526.92  3,092.64    512.13          0 38,534,031     68,438
05/11/2015 06:00:40 05/11/2015 07:00:41        3,516.35  3,112.10    481.54          0 34,000,400     65,129
05/11/2015 07:00:41 05/11/2015 08:00:43        3,521.35  2,889.92    707.48          0 25,362,928     91,970
05/11/2015 08:00:43 05/11/2015 09:00:49        3,529.14  2,703.83    899.84          0 13,194,352    149,575
05/11/2015 09:00:49 05/11/2015 10:00:57        3,527.23  2,501.88  1,099.39          0 12,425,802    168,407
05/11/2015 10:00:57 05/11/2015 11:00:05        3,457.85  2,128.88  1,405.28          0 12,198,139    228,599
05/11/2015 11:00:05 05/11/2015 12:00:13        3,533.07  2,801.07    807.50          0 14,972,963    103,139
05/11/2015 12:00:13 05/11/2015 13:00:22        3,485.39    865.92  2,696.10          5,084,815    466,098
05/11/2015 13:00:22 05/11/2015 14:00:35        3,531.56    290.73  3,280.07          2,793,640    435,604 

(curse this formatting)

I have seen that Buffer Gets have been fairly consistent from run to run. I use that as an indication of how the SQL is progressing.

Note that the hourly pace of Buffer Gets drops significantly around 08:00 (probably started sometime before that) and drops again at about 12:00. However, the Physical Reads increase significantly at the same time. Its pace is nowless than a tenth of what it was earlier in the run.

I queried DBA_HIST_ACTIVE_SESS_HISTORY for this SQL_ID, and it shows that, early in the run, the events that referenced a datafile/tablespace were all accessing the tablespace for the application (so far, so good). But later, the I/O activity switches to almost entirely accesses of the UNDO tablespace.

So, my suspicions are:

1. Something updated one of those 16 tables after this query started, and the UNDO activity is incurred to reconstruct a consistent version of the changed blocks.

2. Something triggered a Delayed Block Cleanout condition on one of those 16 tables (not necessarily after this started) and the UNDO activity  is incurred trying to determine if the blocks are usable.

I have queried ASH for either of these possibilities and have come up empty. That doesn't mean it didn't happen, but ASH does not have it if it did.

I need to know if there is any other cause that could trigger this. What else should I be looking for?

This same issue (including the UNDO) occurred a few days ago, causing us to kill and re-run the job. Now it has occurred again. It has not occurred in the previous 6 months (our AWR retention)..

Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi

PL/SQL Release 10.2.0.5.0 - Production

CORE    10.2.0.5.0      Production

TNS for Solaris: Version 10.2.0.5.0 - Production

NLSRTL Version 10.2.0.5.0 - Production

Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Jun 8 2015
Added on May 11 2015
2 comments
6,180 views