This is Oracle 11.2.0.2 on Solaris. At a high level: we have a fairly large table (about 430 million rows) on one of our dev instances that was populated by a large data load. Two date fields were left NULL, so we now have to go back and update every row with SYSDATE after the fact (some downstream apps require some kind of date in these fields).
Anyway, I wrote a script for it (truncated slightly for brevity):
DECLARE
    ...
    ln_limit      NUMBER := 10000;  -- max number of rows to process in a single bulk update
    lv_total_rows INTEGER := 0;
    lv_count      NUMBER;
    lv_rowid      VARCHAR2(18);

    CURSOR tco_cursor IS
        SELECT ROWID
          FROM LXRO
         WHERE LxModDate IS NULL;

    TYPE t_rows IS TABLE OF ROWID;
    lv_rowids t_rows;
BEGIN
    ...
    OPEN tco_cursor;
    LOOP
        -- fetch the next batch of ROWIDs
        FETCH tco_cursor BULK COLLECT INTO lv_rowids LIMIT ln_limit;

        -- update the whole batch in one bulk statement
        FORALL i IN 1 .. lv_rowids.COUNT
            UPDATE LXRO
               SET LxModDate = SYSDATE
             WHERE ROWID = lv_rowids(i);

        ...  -- (writes to log table here)
        COMMIT;
        EXIT WHEN lv_rowids.COUNT < ln_limit;
    END LOOP;
    CLOSE tco_cursor;
    COMMIT;
EXCEPTION
    WHEN OTHERS THEN
        dbms_output.put_line(TO_CHAR(lv_rowid) || ' - ' || SQLERRM || ' - ' || TO_CHAR(ln_counter));
END;
/
The job runs for only about an hour, updating 37 million rows in a single run, and then fails with the following error:
ORA-01555: snapshot too old: rollback segment number 30 with name "_SYSSMU30_4136132754$" too small
I did some investigation into the error and came across some interesting advice:
Don't fetch across commits. That is, don't fetch on a cursor that was opened prior to the last commit, particularly if the data queried by the cursor is being changed in the current session.
Also, if the query is a loop with a commit in it, it can do the same thing even without other queries: eventually the next fetch needs a read-consistent view of blocks the loop itself already changed and committed, can't rebuild it from the undo, and barfs.
Upping undo_retention may help, or may not, depending on the real cause. Also check v$undostat; you may still have information in there if this is ongoing (or you may not, since by the time you check, the needed info may be gone).
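In case it's useful to anyone answering, this is roughly the check I was planning to run against v$undostat and the undo parameters (just a sketch; I'm not sure these are the most relevant columns):

SELECT begin_time,
       end_time,
       undoblks,             -- undo blocks consumed in this 10-minute interval
       maxquerylen,          -- longest-running query (in seconds) during the interval
       ssolderrcnt,          -- ORA-01555 errors recorded in the interval
       nospaceerrcnt,
       tuned_undoretention
  FROM v$undostat
 ORDER BY begin_time DESC;

SELECT name, value
  FROM v$parameter
 WHERE name IN ('undo_retention', 'undo_tablespace');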
I think our UNDO tablespace is 16 GB right now. I wouldn't think holding 400+ million rows' worth of uncommitted changes in a single transaction would be feasible, which is why I am committing every 10k rows; we were also running into memory issues with a non-bulk version of this script before. This is only going to be a one-time process (we have fixed our data migration to populate these fields in future loads).

Am I best off just babysitting this job and re-running it every few hours during off hours, given the amount of data? Is it naive to think I could perform this many updates without errors in a single run?
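For reference, the alternative I've been toying with (not tested against our data yet) would drop the open cursor entirely and just re-query each batch, so nothing fetches across a commit. The 10k batch size is simply carried over from the script above:

BEGIN
    LOOP
        -- update the next batch of rows that still have a NULL date
        UPDATE LXRO
           SET LxModDate = SYSDATE
         WHERE LxModDate IS NULL
           AND ROWNUM <= 10000;

        EXIT WHEN SQL%ROWCOUNT = 0;  -- nothing left to update
        COMMIT;
    END LOOP;
    COMMIT;
END;
/

I realize each pass would likely have to re-scan for the remaining NULL rows, so it may be slower overall, but it at least avoids fetching across commits.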