Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

generic column lineage for Oracle

Alex R5 days ago — edited 5 days ago

Pretty much any person who works with data faces a task to trace attributes from source to target (also referred as data lineage).

In this post I will describe how that can be done for a given query.

It’s been known for quite a while that it is possible to parse a query and get result in XML format.

Philipp wrote about that 15 years ago.

However, if we also use dbms_utility.expand_sql_text then approach becomes much more flexible and generic.

Below is a quick demo.

create table tt1(a1, b1, c1) as select 1, 1, 1 from dual;

create table tt2(a2, b2, c2) as select 1, 1, 1 from dual;

create table tt3(a3, b3, c3) as select 1, 1, 1 from dual;

create table tt4(a4, b4, c4) as select 1, 1, 1 from dual;

create or replace view vvv as
select (x + c3) * tt3.a3 x1, t0.l - sign(b3) + (select max(c4+a4) from tt4) x2
  from (select tt1.b1 + (tt1.a1 + 1) * tt2.c2 x,
               length(substr('dummy', tt1.b1, tt2.b2)) l
          from tt1
          join tt2
            on tt1.a1 = tt2.a2) t0
 cross join tt3;

Now we can get all source columns and relevant tables which contribute to target columns by traversing parsed XML.

SQL> with t as
  2  (
  3  select col.*, dst_col, is_top
  4  from xmltable('//SELECT_LIST_ITEM'
  5                passing parse_query(expand_query('select * from vvv'))
  6                columns
  7                  dst_col varchar2(30) path 'COLUMN_ALIAS',
  8                  is_top  varchar2(30) path '1-count(ancestor::*[4])',
  9                  orig    xmltype      path '.'
 10       ) sel,
 11       xmltable('//COLUMN_REF'
 12                passing sel.orig
 13                columns
 14                  schema  varchar2(30) path 'SCHEMA',
 15                  tbl     varchar2(30) path 'TABLE',
 16                  alias   varchar2(30) path 'TABLE_ALIAS',
 17                  src_col varchar2(30) path 'COLUMN'
 18       ) col
 19  ),
 20  r(lvl, schema, tbl, alias, src_col, dst_col, root_col) as
 21  (
 22      select 1, t.schema, t.tbl, t.alias, t.src_col, t.dst_col, t.dst_col root_col
 23      from t where is_top = 1
 24      union all
 25      select lvl+1, t.schema, t.tbl, t.alias, t.src_col, t.dst_col, r.root_col root_col
 26      from t, r
 27      where r.src_col = t.dst_col and r.alias <> t.alias
 28  )
 29  select tbl, src_col, root_col dst_col
 30  from r
 31  where tbl is not null
 32  order by 1, 2, 3;

TBL                  SRC_COL              DST_COL
-------------------- -------------------- --------------------
TT1                  A1                   X1
TT1                  B1                   X1
TT1                  B1                   X2
TT2                  B2                   X2
TT2                  C2                   X1
TT3                  A3                   X1
TT3                  B3                   X2
TT3                  C3                   X1
TT4                  A4                   X2
TT4                  C4                   X2

10 rows selected.

You can get all the source here.

PS. You do not want to mix connect by and xmltable in the same query because of this.

Comments
Post Details
Added 5 days ago
15 comments
167 views