Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback, please email oracle-forums_moderators_us@oracle.com

Avoid MERGE to load DUPLICATE data

Prasanna KondruFeb 21 2024

Hi All,

I am trying to load data from staging table to master table. the criteria is, load rows whose column data is modified when compare the existing row in the master table. if not, do nothing. So I opted out for MERGE. but for my surprise, it is always inserting all rows from staging to master.

Master Table: REPAIR_STATS_MASTER & Staging Table: REPAIR_STATS_STAGING

My SQL is:

MERGE into REPAIR_STATS_MASTER M
using REPAIR_STATS_STAGING S
on ( M.CLUSTER_ID = S.CLUSTER_ID
AND M.CLUSTER_NAME = S.CLUSTER_NAME
AND M.KEYSPACE_NAME = S.KEYSPACE_NAME
AND M.REPAIR_ID = S.REPAIR_ID
AND M.STATE = S.STATE
AND M.INTENSITY = S.INTENSITY
AND M.START_TIME = S.START_TIME
AND M.PAUSE_TIME = S.PAUSE_TIME
AND M.END_TIME = S.END_TIME
AND M.DURATION = S.DURATION
AND M.REPAIR_THREAD_COUNT = S.REPAIR_THREAD_COUNT
AND M.LAST_EVENT = S.LAST_EVENT
AND M.SEGMENTS_REPAIRED = S.SEGMENTS_REPAIRED
AND M.TOTAL_SEGMENTS = S.TOTAL_SEGMENTS
AND M.REPAIR_PARALLELISM = S.REPAIR_PARALLELISM
AND M.INCREMENTAL_REPAIR = S.INCREMENTAL_REPAIR)
WHEN MATCHED THEN
update set M.CAPTURE_DATE=S.CAPTURE_DATE where 1=2
WHEN NOT MATCHED THEN
insert (M.CAPTURE_DATE, M.CLUSTER_ID, M.CLUSTER_NAME, M.KEYSPACE_NAME, M.REPAIR_ID, M.STATE, M.INTENSITY, M.START_TIME, M.PAUSE_TIME, M.END_TIME, M.DURATION, M.REPAIR_THREAD_COUNT, M.LAST_EVENT, M.SEGMENTS_REPAIRED, M.TOTAL_SEGMENTS, M.REPAIR_PARALLELISM, M.INCREMENTAL_REPAIR) values (S.CAPTURE_DATE, S.CLUSTER_ID, S.CLUSTER_NAME, S.KEYSPACE_NAME, S.REPAIR_ID, S.STATE, S.INTENSITY, S.START_TIME, S.PAUSE_TIME, S.END_TIME, S.DURATION, S.REPAIR_THREAD_COUNT, S.LAST_EVENT, S.SEGMENTS_REPAIRED, S.TOTAL_SEGMENTS, S.REPAIR_PARALLELISM, S.INCREMENTAL_REPAIR);

Execution Output:

SQL> select count(*) from REPAIR_STATS_MASTER ;

COUNT(*)
----------
0

SQL> select count(*) from CASS_REPAIR_STATS_STAGING ;

COUNT(*)
----------
2066

SQL> MERGE into PSGMGR.CASS_REPAIR_STATS_X R
2 using PSGMGR.STG_CASS_REPAIR_STATS S
.
.

2066 rows merged.

SQL> MERGE into PSGMGR.CASS_REPAIR_STATS_X R
2 using PSGMGR.STG_CASS_REPAIR_STATS S
.
.

2066 rows merged.

SQL> select count(*) from PSGMGM.CASS_REPAIR_STATS_X;

COUNT(*)
----------
4132

My point is, when I execute 1st time, it should load 2066 rows but on the 2nd execution, it should not insert the data as the same data exists in the master table.

This post has been answered by Frank Kulash on Feb 21 2024
Jump to Answer
Comments
Post Details
Added on Feb 21 2024
5 comments
94 views