
SQL & PL/SQL

Archive rows from one table to another, in batches of N at a time

Paul MooreOct 18 2010 — edited Oct 19 2010
I am trying to set up an archiving process to move rows out of a large table into an archive table. I am using Oracle 10gR2, and I do not have the partitioning option. The current process is a simple (apologies for the layout, I can't see a way of formatting code neatly for posting...)

for r in (select ... from table where ...) loop
  insert into arch_table (...) values (r.a, r.b, ...);
  -- error handling
  delete from table where rowid = r.rowid;
  -- error handling
  commit_count := commit_count + 1;
  if commit_count >= N then
    commit;
    commit_count := 0;
  end if;
end loop;

I know this is not a good approach - we're looking at fixing it because it's getting the inevitable "snapshot too old" errors, apart from anything else - and I'd like to build something more robust.

I do need to take only N rows at a time: firstly, because we don't have the space to create a big enough undo tablespace to do everything at once; and secondly, because there is no business reason to insist that the archiving is done in a single transaction. It's perfectly acceptable to do "some at a time", and having multiple commits makes the process restartable while still making some progress on each run.

My first thought was to use rownum in the where clause to just do a bulk insert then a bulk delete:

insert into archive_table (...) select ... from table where ... and rownum < N;
delete from table where ... and rownum < N;
commit;

(I'd need some error logging clauses in there to be able to report errors properly, but that's OK).

However, I can't find anything that convinces me that this use of rownum is deterministic - that is, that the delete will always remove exactly the rows that the insert copied. I could imagine the insert and the delete getting different execution plans, so that the rows are selected in a different order, and I can't think of a way to prove that this couldn't happen.
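One way to sidestep the determinism question entirely is to let the delete itself tell you which rows it removed, via the DML RETURNING ... BULK COLLECT clause (available in 10gR2), and then insert exactly those rows into the archive. A minimal sketch, with the table names, columns, batch size and archive predicate all assumed for illustration:

```sql
-- Hypothetical names: my_table, my_table_arch, columns id/val,
-- predicate status = 'DONE', batch size 1000.
declare
  type t_ids  is table of my_table.id%type;
  type t_vals is table of my_table.val%type;
  l_ids  t_ids;
  l_vals t_vals;
begin
  -- Delete one batch and capture exactly the rows that were removed;
  -- there is no second pass that could pick a different set of rows.
  delete from my_table
   where status = 'DONE'
     and rownum <= 1000
  returning id, val bulk collect into l_ids, l_vals;

  -- Re-insert the captured rows into the archive in the same transaction.
  forall i in 1 .. l_ids.count
    insert into my_table_arch (id, val) values (l_ids(i), l_vals(i));

  commit;
end;
/
```

Because both statements sit in one transaction, a failure before the commit rolls back both the delete and the insert, so no rows are lost; the block can then simply be re-run until the delete finds nothing left to remove.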

Alternatively, I could do a bulk collect to select the rows in batches, then do a bulk insert followed by a rowid-based delete. That way there's a single select, so there's no issue of mismatches, but this would potentially use a lot of client memory to hold the row set.
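On the memory concern with this alternative: the batch size itself bounds the collection, so fetching only N rowids per run keeps PL/SQL memory small. A sketch of that variant (again with names, columns, predicate and batch size assumed), doing one batch per invocation so no cursor is held open across the commit:

```sql
-- Hypothetical names: my_table, my_table_arch, columns id/val,
-- predicate status = 'DONE', batch size 1000. Rerun until 0 rows moved.
declare
  type t_rids is table of rowid;
  l_rids t_rids;
begin
  -- Single select: the batch size caps both undo usage and PL/SQL memory.
  select rowid bulk collect into l_rids
    from my_table
   where status = 'DONE'
     and rownum <= 1000;

  -- Insert and delete by rowid, so both statements act on the same rows.
  forall i in 1 .. l_rids.count
    insert into my_table_arch (id, val)
      select id, val from my_table where rowid = l_rids(i);

  forall i in 1 .. l_rids.count
    delete from my_table where rowid = l_rids(i);

  commit;
end;
/
```

Opening a fresh query per batch, rather than committing inside a loop over one long-running cursor, also avoids re-creating the "snapshot too old" problem that the current process suffers from.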

Does anybody have any comments or suggestions on the best way to handle this? I'd prefer a solution along the lines of the first suggestion (use rownum in the where clause) if I could find something I could be sure was reliable. I just have a gut reaction that it "should" be possible to do something like this in a single statement. I've looked briefly at esoteric uses of the merge statement to do the insert/delete in a single statement, but couldn't find anything.

It's a problem that seems to come up a lot in discussions, but I have never yet seen a decent discussion of the various tradeoffs. (Most solutions I've seen tend to either suggest "bite the bullet and do it in one transaction and give it enough undo" or "use features of the data (for example, a record ID column) to roughly partition the data into manageable sizes", neither of which would be particularly easy in this case).

Thanks in advance for any help or suggestions.
Paul
Locked on Nov 16 2010