SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Combine Separate Result Sets Into One

3683884Sep 12 2020 — edited Sep 13 2020

Hi,

My Oracle DB Version is 12.1.0.2.0. I am working on a query which needs data sets to be combined from two different table sources. The source table (Primary Key: id, line_number, code) has millions of records rows but this process is supposed to run every day and it will most likely pick a few thousand records rows from the source table based on the entered_datetime, while making sure that these thousand records rows do not exist in the target table (Primary Key: id, line_number, code) - Please see the NOT EXISTS condition below.

Once we have the records rows from the source table, I need the corresponding records rows from the target table where the id matches so that few calculations (mostly analytic functions) can be run on the entire result set. So far, I have been able to come up with this:

Please note the below query is being used as an inline view in a much bigger SELECT.

SELECT id
     , line_number
     , code
     , 'SOURCE_TABLE' source
FROM source_table src
WHERE src.entered_datetime > p_last_run_date_of_process --Replaced SYSDATE with p_last_run_date_of_process
     AND NOT EXISTS (SELECT NULL
                    FROM target_table tgt
                    WHERE tgt.id = src.id
                         AND tgt.line_number = src.line_number
                         AND tgt.code = src.code)
UNION ALL
SELECT id
     , line_number
     , code
     , 'TARGET_TABLE' source
FROM target_table tgt
;

Ideally, if from the source table, the only record rows which came out of the 1st SELECT above is with an id = 100, I would prefer to have the 2nd SELECT from the target table to run only for id = 100, so that I can avoid a full UNION ALL of the entire target table because as days go by, the size of the target_table is also going to grow and I am concerned that a UNION ALL without any filters on the target_table SELECT is not optimal.

Is there a way to restrict the target table SELECT based on the output of the source table?

Desired Output:

ID	LINE_NUMBER	CODE	SOURCE
100	1	X	TARGET_TABLE
100	2	X	SOURCE_TABLE
100	3	Y	TARGET_TABLE
100	4	Z	TARGET_TABLE

Please advise.

Thanks!

=========================================================================================================================================

Update 1:

DDL and DML statements below:

--DDL source_table
CREATE TABLE source_table
(
    id NUMBER
    , line_number NUMBER
    , code VARCHAR2(3)
    --other columns omitted
    , entered_datetime DATE
    , CONSTRAINT pk_source_table PRIMARY KEY (id, line_number, code)
);
--DDL target_table
CREATE TABLE target_table
(
    id NUMBER
    , line_number NUMBER
    , code VARCHAR2(3)
    --other columns omitted
    , CONSTRAINT pk_target_table PRIMARY KEY (id, line_number, code)
);
--DML source_table
INSERT INTO source_table VALUES (10, 1, 'X', SYSDATE - 10);
INSERT INTO source_table VALUES (10, 2, 'X', SYSDATE - 10);
INSERT INTO source_table VALUES (10, 3, 'X', SYSDATE - 10);
INSERT INTO source_table VALUES (100, 1, 'X', SYSDATE + 1);
INSERT INTO source_table VALUES (100, 2, 'X', SYSDATE + 1);
INSERT INTO source_table VALUES (100, 3, 'Y', SYSDATE + 1);
INSERT INTO source_table VALUES (100, 4, 'Z', SYSDATE + 1);
INSERT INTO source_table VALUES (1000, 1, 'Z', SYSDATE + 1);
INSERT INTO source_table VALUES (1000, 2, 'Z', SYSDATE + 1);
--DML target_table
INSERT INTO target_table VALUES (10, 1, 'X');
INSERT INTO target_table VALUES (10, 2, 'X');
INSERT INTO target_table VALUES (10, 3, 'X');
INSERT INTO target_table VALUES (100, 1, 'X');
INSERT INTO target_table VALUES (100, 3, 'Y');
INSERT INTO target_table VALUES (100, 4, 'Z');
---------------------------------------
SELECT *
FROM source_table
ORDER BY id, line_number, code;
SELECT *
FROM target_table
ORDER BY id, line_number, code;
---------------------------------------
--This is what I currently have
SELECT id
     , line_number
     , code
     , 'SOURCE_TABLE' source
FROM source_table src
WHERE src.entered_datetime > SYSDATE
     AND NOT EXISTS (SELECT NULL
                    FROM target_table tgt
                    WHERE tgt.id = src.id
                         AND tgt.line_number = src.line_number
                         AND tgt.code = src.code)
UNION ALL
SELECT id
     , line_number
     , code
     , 'TARGET_TABLE' source
FROM target_table tgt
ORDER BY id, line_number, code;
--Optimal output will be
SELECT id
     , line_number
     , code
     , 'SOURCE_TABLE' source
FROM source_table src
WHERE src.entered_datetime > SYSDATE
     AND NOT EXISTS (SELECT NULL
                    FROM target_table tgt
                    WHERE tgt.id = src.id
                         AND tgt.line_number = src.line_number
                         AND tgt.code = src.code)
UNION ALL
SELECT id
     , line_number
     , code
     , 'TARGET_TABLE' source
FROM target_table tgt
WHERE tgt.id IN (100, 1000) --Since these are the only ids which came from the above source_table SELECT
ORDER BY id, line_number, code;

Current Output:

ID	LINE_NUMBER	CODE	SOURCE
10	1	X	TARGET_TABLE
10	2	X	TARGET_TABLE
10	3	X	TARGET_TABLE
100	1	X	TARGET_TABLE
100	2	X	SOURCE_TABLE
100	3	Y	TARGET_TABLE
100	4	Z	TARGET_TABLE
1000	1	Z	SOURCE_TABLE
1000	2	Z	SOURCE_TABLE

Optimal Output:

ID	LINE_NUMBER	CODE	SOURCE
100	1	X	TARGET_TABLE
100	2	X	SOURCE_TABLE
100	3	Y	TARGET_TABLE
100	4	Z	TARGET_TABLE
1000	1	Z	SOURCE_TABLE
1000	2	Z	SOURCE_TABLE

Message was edited by: 3683884

This post has been answered by mathguy on Sep 12 2020

Jump to Answer

Added on Sep 12 2020

#plsql, #select, #sql

17 comments

1,600 views