Hi,
My Oracle DB Version is 12.1.0.2.0. I am working on a query which needs data sets to be combined from two different table sources. The source table (Primary Key: id, line_number, code) has millions of records rows but this process is supposed to run every day and it will most likely pick a few thousand records rows from the source table based on the entered_datetime, while making sure that these thousand records rows do not exist in the target table (Primary Key: id, line_number, code) - Please see the NOT EXISTS condition below.
Once we have the records rows from the source table, I need the corresponding records rows from the target table where the id matches so that few calculations (mostly analytic functions) can be run on the entire result set. So far, I have been able to come up with this:
Please note the below query is being used as an inline view in a much bigger SELECT.
SELECT id
, line_number
, code
, 'SOURCE_TABLE' source
FROM source_table src
WHERE src.entered_datetime > p_last_run_date_of_process --Replaced SYSDATE with p_last_run_date_of_process
AND NOT EXISTS (SELECT NULL
FROM target_table tgt
WHERE tgt.id = src.id
AND tgt.line_number = src.line_number
AND tgt.code = src.code)
UNION ALL
SELECT id
, line_number
, code
, 'TARGET_TABLE' source
FROM target_table tgt
;
Ideally, if from the source table, the only record rows which came out of the 1st SELECT above is with an id = 100, I would prefer to have the 2nd SELECT from the target table to run only for id = 100, so that I can avoid a full UNION ALL of the entire target table because as days go by, the size of the target_table is also going to grow and I am concerned that a UNION ALL without any filters on the target_table SELECT is not optimal.
Is there a way to restrict the target table SELECT based on the output of the source table?
Desired Output:
| ID | LINE_NUMBER | CODE | SOURCE |
|---|
| 100 | 1 | X | TARGET_TABLE |
| 100 | 2 | X | SOURCE_TABLE |
| 100 | 3 | Y | TARGET_TABLE |
| 100 | 4 | Z | TARGET_TABLE |
Please advise.
Thanks!
=========================================================================================================================================
Update 1:
DDL and DML statements below:
--DDL source_table
CREATE TABLE source_table
(
id NUMBER
, line_number NUMBER
, code VARCHAR2(3)
--other columns omitted
, entered_datetime DATE
, CONSTRAINT pk_source_table PRIMARY KEY (id, line_number, code)
);
--DDL target_table
CREATE TABLE target_table
(
id NUMBER
, line_number NUMBER
, code VARCHAR2(3)
--other columns omitted
, CONSTRAINT pk_target_table PRIMARY KEY (id, line_number, code)
);
--DML source_table
INSERT INTO source_table VALUES (10, 1, 'X', SYSDATE - 10);
INSERT INTO source_table VALUES (10, 2, 'X', SYSDATE - 10);
INSERT INTO source_table VALUES (10, 3, 'X', SYSDATE - 10);
INSERT INTO source_table VALUES (100, 1, 'X', SYSDATE + 1);
INSERT INTO source_table VALUES (100, 2, 'X', SYSDATE + 1);
INSERT INTO source_table VALUES (100, 3, 'Y', SYSDATE + 1);
INSERT INTO source_table VALUES (100, 4, 'Z', SYSDATE + 1);
INSERT INTO source_table VALUES (1000, 1, 'Z', SYSDATE + 1);
INSERT INTO source_table VALUES (1000, 2, 'Z', SYSDATE + 1);
--DML target_table
INSERT INTO target_table VALUES (10, 1, 'X');
INSERT INTO target_table VALUES (10, 2, 'X');
INSERT INTO target_table VALUES (10, 3, 'X');
INSERT INTO target_table VALUES (100, 1, 'X');
INSERT INTO target_table VALUES (100, 3, 'Y');
INSERT INTO target_table VALUES (100, 4, 'Z');
---------------------------------------
SELECT *
FROM source_table
ORDER BY id, line_number, code;
SELECT *
FROM target_table
ORDER BY id, line_number, code;
---------------------------------------
--This is what I currently have
SELECT id
, line_number
, code
, 'SOURCE_TABLE' source
FROM source_table src
WHERE src.entered_datetime > SYSDATE
AND NOT EXISTS (SELECT NULL
FROM target_table tgt
WHERE tgt.id = src.id
AND tgt.line_number = src.line_number
AND tgt.code = src.code)
UNION ALL
SELECT id
, line_number
, code
, 'TARGET_TABLE' source
FROM target_table tgt
ORDER BY id, line_number, code;
--Optimal output will be
SELECT id
, line_number
, code
, 'SOURCE_TABLE' source
FROM source_table src
WHERE src.entered_datetime > SYSDATE
AND NOT EXISTS (SELECT NULL
FROM target_table tgt
WHERE tgt.id = src.id
AND tgt.line_number = src.line_number
AND tgt.code = src.code)
UNION ALL
SELECT id
, line_number
, code
, 'TARGET_TABLE' source
FROM target_table tgt
WHERE tgt.id IN (100, 1000) --Since these are the only ids which came from the above source_table SELECT
ORDER BY id, line_number, code;
Current Output:
| ID | LINE_NUMBER | CODE | SOURCE |
|---|
| 10 | 1 | X | TARGET_TABLE |
| 10 | 2 | X | TARGET_TABLE |
| 10 | 3 | X | TARGET_TABLE |
| 100 | 1 | X | TARGET_TABLE |
| 100 | 2 | X | SOURCE_TABLE |
| 100 | 3 | Y | TARGET_TABLE |
| 100 | 4 | Z | TARGET_TABLE |
| 1000 | 1 | Z | SOURCE_TABLE |
| 1000 | 2 | Z | SOURCE_TABLE |
Optimal Output:
| ID | LINE_NUMBER | CODE | SOURCE |
|---|
| 100 | 1 | X | TARGET_TABLE |
| 100 | 2 | X | SOURCE_TABLE |
| 100 | 3 | Y | TARGET_TABLE |
| 100 | 4 | Z | TARGET_TABLE |
| 1000 | 1 | Z | SOURCE_TABLE |
| 1000 | 2 | Z | SOURCE_TABLE |
Message was edited by: 3683884