Hi,
my question is on finding the lowest common ancestor (LCA) of some given nodes of a tree (graph with no cycles and one root) on 12.1.2.
Setup below:
with regions (c, p) as (
select 0, null from dual union all
select 1, 0 from dual union all
select 2, 1 from dual union all
select 3, 2 from dual union all
select 4, 3 from dual union all
select 5, 1 from dual union all
select 6, 5 from dual union all
select 7, 0 from dual union all
select 8, 7 from dual
)
, combos (n, c) as (
select 1, 3 from dual union all
select 1, 4 from dual union all
select 2, 3 from dual union all
select 3, 3 from dual union all
select 3, 4 from dual union all
select 3, 5 from dual union all
select 4, 2 from dual union all
select 4, 8 from dual union all
select 5, 6 from dual
)
select c.n, r.c, r.pth
from (
select c, p
, sys_connect_by_path (c, ',') pth
from regions r
start with p is null
connect by prior c = p
) r
, combos c
where r.c = c.c
order by c.n, r.c
The combination (n, c) is unique.
The resulting paths are shown below
N | C | PTH |
1 | 3 | ,0,1,2,3 |
1 | 4 | ,0,1,2,3,4 |
2 | 3 | ,0,1,2,3 |
3 | 3 | ,0,1,2,3 |
3 | 4 | ,0,1,2,3,4 |
3 | 5 | ,0,1,5 |
4 | 2 | ,0,1,2 |
4 | 8 | ,0,7,8 |
5 | 6 | ,0,1,5,6 |
From that one can find the LCA for example for n = 1 => 3 or for n=4 => 0.
I came up with the below that gives the desired result:
select n
,listagg(c, ', ') within group (order by c) cc
,case when count (distinct c) = 1
then max (to_char(c))
else region
end c
,region
from (
select n
,c
,rtrim(substr(pth, instr(pth, ',', -1, 2) + 1 ),',') region
from (
select c.n
,r.c
,min(lev - case when isleaf = 1 then 1 else 0 end) over (partition by c.n)
mlev
,case when isleaf = 1 then substr(pth, 1, instr(pth, ',', -1, 2))
else pth
end pth
from (
select c
,sys_connect_by_path (c, ',')||',' pth
,level lev
,connect_by_isleaf isleaf
from regions r
start with p is null
connect by prior c = p
) r
, combos c
where r.c = c.c
)
model
partition by (n)
dimension by (c)
measures (substr(pth, 1, instr(pth,',', 1, mlev + 1)) as pth
, count(distinct pth)over(partition by n) as cnt)
rules sequential order
iterate(10) ( -- assuming 10+1 is the deepest level
pth[any] = case when cnt[cv()] > 1
then substr(pth[cv()], 1, instr(pth[cv()], ',' , -1, 2))
else pth[cv()]
end
,cnt[any]=count(distinct pth)[any]
)
)
group by n, region
N | CC | C | REGION |
1 | 3, 4 | 3 | 3 |
2 | 3 | 3 | 3 |
3 | 3, 4, 5 | 1 | 1 |
4 | 2, 8 | 0 | 0 |
5 | 6 | 6 | 5 |
Column region shows the LCA.
Column cc shows the nodes the LCA was calculated for.
Column c shows the desired outcome, that is additionally in the case of only one node in column c the node itself should be taken instead of its ancestor (e. g. n=5).
I am not glad with my solution to say the least, but i somehow got stucked, since it seems to work.
Perhaps someone has an idea how to improve it (shorten, get rid of the model-clause, readibility, correctness ...) since this seems to be a common graph theory problem as i found out afterwards ;-)
I would prefer to keep it in SQL rather than switching to pl/sql.
Thanks and regards
Chris