SQL & PL/SQL

How to split a MERGE query into separate UPDATE and INSERT queries in batches for a large dataset?

Hi,
I need to MERGE data ranging from 1 GB to 100 GB from sourcetable to targettable.
Source table (sourcetable): empid, name, salary, dependents, status, leavestaken, lobcol1, lobcol2
Target table (targettable): empid, name, salary, dependents, status, leavestaken, lobcol1, lobcol2
My current MERGE statement is as follows:
MERGE into targettable t
using sourcetable s
on (t.empid = s.empid)
when matched then
update set t.name = s.name,
t.salary = s.salary,
t.dependents = s.dependents,
t.status = s.status,
t.leavestaken = s.leavestaken,
t.lobcol1 = s.lobcol1,
t.lobcol2 = s.lobcol2
where (t.name <> s.name
or t.salary <> s.salary
or t.dependents <> s.dependents
or t.status <> s.status
or t.leavestaken <> s.leavestaken)
when not matched then
insert (empid, name, salary, dependents, status, leavestaken, lobcol1, lobcol2)
values (s.empid, s.name, s.salary, s.dependents, s.status, s.leavestaken, s.lobcol1, s.lobcol2);
This works fine if the data being loaded is less than 3 GB. With larger volumes, the application throws a socket read timeout error. The reason is that I execute the query over JDBC from my application code, and the JDBC connection comes from a WLS datasource connection pool with the read timeout and statement timeout both set to 2 minutes.
To get around this, I am trying the following approaches and collecting the time taken by each one to decide on the best option:
Option #1: Merge in batches of 50000 rows from sourcetable to target table
Example of one batch:
MERGE into targettable t
using (select * from sourcetable offset 0 rows fetch next 50000 rows only) s
on (t.empid = s.empid)
when matched then
update set t.name = s.name,
t.salary = s.salary,
t.dependents = s.dependents,
t.status = s.status,
t.leavestaken = s.leavestaken,
t.lobcol1 = s.lobcol1,
t.lobcol2 = s.lobcol2
where (t.name <> s.name
or t.salary <> s.salary
or t.dependents <> s.dependents
or t.status <> s.status
or t.leavestaken <> s.leavestaken)
when not matched then
insert (empid, name, salary, dependents, status, leavestaken, lobcol1, lobcol2)
values (s.empid, s.name, s.salary, s.dependents, s.status, s.leavestaken, s.lobcol1, s.lobcol2);
With this approach, the maximum time taken per batch in my testing is always under 30 seconds, depending on the number of rows being merged.
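To avoid issuing the batches one by one over JDBC (so no single call has to survive the whole load), one idea I am considering is to drive the same loop inside the database. The block below is only a rough PL/SQL sketch: it assumes sourcetable is not being modified while the load runs, keeps the 50000-row batch size from the example above, and adds an ORDER BY empid so that the OFFSET paging is deterministic across iterations.

declare
  c_batch_size constant pls_integer := 50000;
  l_total      pls_integer;
  l_offset     pls_integer := 0;
begin
  -- the total number of source rows drives the number of batches
  select count(*) into l_total from sourcetable;

  while l_offset < l_total loop
    merge into targettable t
    using (select *
           from sourcetable
           order by empid  -- deterministic paging between batches
           offset l_offset rows fetch next c_batch_size rows only) s
    on (t.empid = s.empid)
    when matched then
      update set t.name        = s.name,
                 t.salary      = s.salary,
                 t.dependents  = s.dependents,
                 t.status      = s.status,
                 t.leavestaken = s.leavestaken,
                 t.lobcol1     = s.lobcol1,
                 t.lobcol2     = s.lobcol2
      where (t.name <> s.name
             or t.salary <> s.salary
             or t.dependents <> s.dependents
             or t.status <> s.status
             or t.leavestaken <> s.leavestaken)
    when not matched then
      insert (empid, name, salary, dependents, status, leavestaken, lobcol1, lobcol2)
      values (s.empid, s.name, s.salary, s.dependents, s.status, s.leavestaken,
              s.lobcol1, s.lobcol2);

    commit;  -- keep each batch in its own transaction
    l_offset := l_offset + c_batch_size;
  end loop;
end;
/

One concern with this pattern is that every iteration re-scans sourcetable to apply the OFFSET, so later batches get progressively more expensive; paging by an indexed key range (for example WHERE empid BETWEEN :lo AND :hi) would avoid that, but the overall shape of the loop stays the same.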
Option #2:
Split the MERGE into separate UPDATE and INSERT statements: fetch 50000 rows at a time from the source table, then run a separate UPDATE and INSERT against the target table.
To achieve this, how can I limit the rows I fetch to 50000 at a time in the UPDATE and INSERT queries below?
update targettable t set
(t.empid, t.name, t.salary, t.dependents, t.status, t.leavestaken, t.lobcol1, t.lobcol2)
= ( SELECT s.empid, s.name, s.salary, s.dependents, s.status, s.leavestaken, s.lobcol1, s.lobcol2
FROM sourcetable s
WHERE t.empid = s.empid
) WHERE EXISTS
(SELECT 1
FROM sourcetable s
WHERE s.empid = t.empid
AND (t.name <> s.name
or t.salary <> s.salary
or t.dependents <> s.dependents
or t.status <> s.status
or t.leavestaken <> s.leavestaken));

insert into targettable (empid, name, salary, dependents, status, leavestaken, lobcol1, lobcol2)
select s.empid, s.name, s.salary, s.dependents, s.status, s.leavestaken, s.lobcol1, s.lobcol2
FROM sourcetable s, targettable t
where s.empid = t.empid(+)
and t.empid is null;
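One variation I am considering (only a sketch, under the same assumption as Option #1 that the window over sourcetable is paged with a deterministic ORDER BY empid) is to push the OFFSET/FETCH into an inline view over sourcetable; for the UPDATE, that inline view would have to appear in both the SET subquery and the EXISTS:

update targettable t set
(t.empid, t.name, t.salary, t.dependents, t.status, t.leavestaken, t.lobcol1, t.lobcol2)
= (select s.empid, s.name, s.salary, s.dependents, s.status, s.leavestaken, s.lobcol1, s.lobcol2
   from (select * from sourcetable
         order by empid
         offset 0 rows fetch next 50000 rows only) s
   where s.empid = t.empid)
where exists
(select 1
 from (select * from sourcetable
       order by empid
       offset 0 rows fetch next 50000 rows only) s
 where s.empid = t.empid
 and (t.name <> s.name
      or t.salary <> s.salary
      or t.dependents <> s.dependents
      or t.status <> s.status
      or t.leavestaken <> s.leavestaken));

The INSERT side seems simpler: because the outer-join anti-join only returns source rows not yet present in targettable, appending fetch first 50000 rows only to its SELECT and simply re-running the statement until it inserts zero rows should walk through the missing rows batch by batch without needing an OFFSET at all. Is that reasoning correct?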

Also, is there any other better option I could try? Note that I need to load this data from the source table into the target table while the target table is actively being used by other users performing UPDATE/DELETE/INSERT operations.
Thanks
