SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Recursive query - get values from top or from below again

user13117585Oct 17 2022

Hello again all of you,
Yesterday, I received very good answers from Frank Kulash and mathguy for a question.
I'm so impressed about the possibilties of SQL language. I try to apply those suggestions to
my issue and I am facing another one. And as suggested by Frank Kulash yesterdat, I create then another topic.
Imagine the same structure as yesterday with a couple of additional fields. It's a tree structure with nodes. Root nodes are identified by parent_id is NULL.
Let's take the following sample data:

drop table nodes;
CREATE TABLE nodes 
(
  id number(10) NOT NULL, -- PK but we don't care 
  parent_id number(10),  -- NULLs are roots 
  from_ DATE,       -- a date that can never be null
  derived_from_ DATE,   -- min(from_) based on all descending nodes
  code varchar2(10)   -- just a code
);
delete nodes;
alter session set nls_date_format = 'YYYY';

-- insert some roots
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(1, NULL, TO_DATE('1999', 'YYYY'), NULL, 'AAA');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(2, NULL, TO_DATE('2000', 'YYYY'), TO_DATE('2000', 'YYYY'), 'B002');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(3, NULL, TO_DATE('2010', 'YYYY'), TO_DATE('2000', 'YYYY'), 'B003');

-- insert a hierachy for each root. ids are random in reality. here is only to make it easier to understand
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(10, 1, TO_DATE('2000', 'YYYY'), TO_DATE('2000', 'YYYY'), 'C010');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(11, 1, TO_DATE('2002', 'YYYY'), TO_DATE('2000', 'YYYY'), 'C011');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(12, 1, TO_DATE('2004', 'YYYY'), TO_DATE('2010', 'YYYY'), 'C012');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(121, 12, TO_DATE('2006', 'YYYY'), TO_DATE('2010', 'YYYY'), 'C0121');

WITH get_tree(id, parent_id, from_, derived_from_, code, lvl, full_from_) AS
(
 SELECT r.id, r.parent_id, r.from_, r.derived_from_, r.code, 1, 
     TO_CHAR(r.from_, 'YYYY') 
  FROM nodes r
  WHERE parent_id IS NULL
  UNION ALL
 SELECT n.id, n.parent_id, n.from_, n.derived_from_, n.code, t.lvl + 1, 
     t.full_from_ || ' > ' || TO_CHAR(n.from_, 'YYYY') 
  FROM get_tree t
  INNER JOIN nodes n ON t.id = n.parent_id
)
SEARCH DEPTH FIRST BY id SET rnk
SELECT * 
 FROM get_tree;


    ID PARENT_ID FROM_  DERIVED CODE       LVL FULL_FROM_              RNK
---------- ---------- ------- ------- ---------- ---------- ------------------------------ ----------
     1 <NULL>   1999  <NULL> AAA         1 1999                  1
    10     1 2000  2000  C010        2 1999 > 2000               2
    11     1 2002  2000  C011        2 1999 > 2002               3
    12     1 2004  2010  C012        2 1999 > 2004               4
    121     12 2006  2010  C0121        3 1999 > 2004 > 2006           5
     2 <NULL>   2000  2000  B002        1 2000                  6
     3 <NULL>   2010  2000  B003        1 2010                  7

7 rows selected.

In this case, when we have a look at rows with rnk from 1 to 5 they all belong to the root with id = 1.
The from_ date in the root is 1999. And all from_ dates from descending nodes have a bigger start date.
So, the derived from, should be set to 1999. Because it's the MIN(from_) we have in the whole hierarchy
under that root. At the same time, since it's that node, then the code, should remain the same.

The derived from below should not be taken in consideration. Whatever the value it has, it can be erased.
The only value that is important is from_. Imagine for ID 121 I had specify a derived from value of 1900,
it shouldn't impact the result.
Now, if I take another scenario:

-- insert a hierachy for each root. ids are random in reality. here is only to make it easier to understand
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(20,  2, TO_DATE('2000', 'YYYY'), TO_DATE('1050', 'YYYY'), 'R020');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(21,  2, TO_DATE('1999', 'YYYY'), TO_DATE('1900', 'YYYY'), 'R021');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(22,  2, TO_DATE('1998', 'YYYY'), TO_DATE('2050', 'YYYY'), 'R022');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(221, 22, TO_DATE('1997', 'YYYY'), TO_DATE('2210', 'YYYY'), 'R221');

And if I rexecute the same query and have a look at the from_ dates for root id = 2, I have this:



        ID  PARENT_ID FROM_   DERIVED CODE              LVL FULL_FROM_                            RNK
---------- ---------- ------- ------- ---------- ---------- ------------------------------ ----------
...
         2 <NULL>     2000    2000    B002                1 2000                                    6
        20          2 2000    1050    R020                2 2000 > 2000                             7
        21          2 1999    1900    R021                2 2000 > 1999                             8
        22          2 1998    2050    R022                2 2000 > 1998                             9
       221         22 1997    2210    R221                3 2000 > 1998 > 1997                     10
...

Here, again, we should forget the value that are already in derived. Here, we see that the oldest from_ is 1997 and it's the node with id = 221.
So, that value should be bring back to the top along with its code. Ideally, an output like this would be nice:

    ID      PARENT_ID FROM_  EXISTING_DERIVED EXISTING_CODE  NEW_DERIVED NEW_CODE        LVL FULL_FROM_                     RNK
---------- ---------- ------- ---------------- ------------- ----------- -------- ---------- ------------------------------ ----------
...
         2 <NULL>      2000               2000 B002                 1997 R221              1 2000                              6
        20           2 2000               1050 R020                 1997 R221              2 2000 > 2000                       7
        21           2 1999               1900 R021                 1997 R221              2 2000 > 1999                       8
        22           2 1998               2050 R022                 1997 R221              2 2000 > 1998                       9
       221          22 1997               2210 R221                 1997 R221              3 2000 > 1998 > 1997               10
...

Now, if I take another scenario:

-- insert a hierachy for each root. ids are random in reality. here is only to make it easier to understand
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(30,  3, TO_DATE('2000', 'YYYY'), TO_DATE('1050', 'YYYY'), 'Q030');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(31,  3, TO_DATE('1999', 'YYYY'), TO_DATE('1900', 'YYYY'), 'Q031');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(311, 31, TO_DATE('1995', 'YYYY'), TO_DATE('2210', 'YYYY'), 'Q331');

INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(32,  3, TO_DATE('1998', 'YYYY'), TO_DATE('2050', 'YYYY'), 'Q032');
INSERT INTO nodes(id, parent_id, from_, derived_from_, code) VALUES(321, 32, TO_DATE('1997', 'YYYY'), TO_DATE('2210', 'YYYY'), 'Q321');

And if I describe the result that I would like from that.

    ID      PARENT_ID FROM_   EXISTING_DERIVED EXISTING_CODE NEW_DERIVED NEW_CODE        LVL FULL_FROM_                            RNK
---------- ---------- ------- ---------------- ------------- ----------- -------- ---------- ------------------------------ ----------
... 
         3 <NULL>        2010             2000          B003        1995 Q331              1 2010                                  11
        30          3    2000             1050          Q030        1995 Q331              2 2010 > 2000                           12
        31          3    1999             1900          Q031        1995 Q331              2 2010 > 1999                           13
       311         31    1995             2210          Q331        1995 Q331              3 2010 > 1999 > 1995                    14
        32          3    1998             2050          Q032        1997 Q321              2 2010 > 1998                           15
       321         32    1997             2210          Q321        1997 Q321              3 2010 > 1998 > 1997                    16
...

Here we have two branches. The branch that is 32 has a child with a from date in 1997. So, it's has to be brought back to node with id = 32.
But, since node 311 has a lower date, we have to bring that one to the root.

I hope my wish is now clear? This is quite easy to make it in PL/SQL. A couple of loops and few and few selects. But, I really like to see how SQL helps to resolve this and it allows me to improve my knowledge. But, if you tell me that is no feasible, it's no issue.

Regards,

This post has been answered by mathguy on Oct 17 2022

Jump to Answer

Added on Oct 17 2022

#19-3-3, #sql

7 comments

1,937 views

1 mention