Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Flatten nested xml structure using xmltable

Nicolai ScheerFeb 10 2015 — edited Feb 12 2015

Hi!

I'm currently struggling to build a xmltable query which does the right thing and performs well.

The xml I'm trying to process is saved in a xmltype column and looks like this:

<customer>
    <name>Mustermann</name>
    <contract>
        <contract_id>C1</contract_id>
        <products>
            <product>
                <name>Product C1.P1</name>
                <price>23.12</price>
                <properties>
                    <property>Property C1.P1.A</property>
                    <property>Property C1.P1.B</property>
                </properties>
            </product>
            <product>
                <name>Product C1.P2</name>
                <price>2.32</price>
            </product>
        </products>
    </contract>
    <contract>
        <contract_id>C2</contract_id>
        <products>
            <product>
                <name>Product C2.P1</name>
                <price>143.33</price>
            </product>
            <product>
                <name>Product C2.P2</name>
                <price>231.76</price>
                <properties>
                    <property>Property C2.P2.A</property>
                    <property>Property C2.P2.B</property>
                </properties>
            </product>
        </products>
    </contract>
    <contract>
        <contract_id>C3</contract_id>
    </contract>
</customer>

The xml structure contains various repeating nodes most of which are optional. E.g. a contract does not need to have a product, a product does not need to have a property.

The desired output looks as follows:

customer_name	contract_id	product_name	product_price	product_property
Mustermann	C1	Product C1.P1	23.12	Property C1.P1.A
Mustermann	C1	Product C1.P1	23.12	Property C1.P1.A
Mustermann	C1	Product C1.P2	2.32	(null)
Mustermann	C2	Product C2.P1	143.33	(null)
Mustermann	C2	Product C2.P2	231.76	Property C2.P2.A
Mustermann	C2	Product C2.P2	231.76	Property C2.P2.B
Mustermann	C3	(null)	(null)	(null)

So basically the nested nodes are split up into separate rows as if I performed multiple left joins.

The Query I tried looks as follows:

create table t_xml_test_1 (exception_id number, xml_data xmltype);

insert into t_xml_test_1 values( 123456, xmltype(
'<customer>
    <name>Mustermann</name>
    <contract>
       <contract_id>C1</contract_id>
       <products>
           <product>
               <name>Product C1.P1</name>
               <price>23.12</price>
               <properties>
                  <property>Property C1.P1.A</property>
                  <property>Property C1.P1.B</property>
               </properties>
           </product>
           <product>
               <name>Product C1.P2</name>
               <price>2.32</price>
           </product>
       </products>
    </contract>
    <contract>
       <contract_id>C2</contract_id>
       <products>
           <product>
               <name>Product C2.P1</name>
               <price>143.33</price>
           </product>
           <product>
               <name>Product C2.P2</name>
               <price>231.76</price>
               <properties>
                  <property>Property C2.P2.A</property>
                  <property>Property C2.P2.B</property>
               </properties>
           </product>
       </products>
    </contract>
    <contract>
       <contract_id>C3</contract_id>
    </contract>
</customer>' )
);

SELECT 
  customer_name, contract_id, product_name, product_price, product_property
FROM 
  t_xml_test_1,
  XMLTable(  
'
for $g1 in /customer/contract
for $g2 in $g1/products/product
for $g3 in $g2/properties/property
return
<root>
  <g0>{/customer/name}</g0>
  <g1>{$g1/contract_id}</g1>
  <g2>{$g2/name}{$g2/price}</g2>
  <g3>{$g3}</g3>
</root>
' 
PASSING t_xml_test_1.xml_data
columns 
  customer_name CLOB path '/root/g0/name',
  contract_id CLOB path '/root/g1/contract_id', 
  product_name CLOB path '/root/g2/name', 
  product_price CLOB path '/root/g2/price',
  product_property CLOB path '/root/g3'
) xmltab;
  exception_id, customer_name, contract_id, product_name, product_price, product_property

This query performs quite well. The problem is, that the nested loop does not pick up the rows where a sub-node (e.g. $g2, $g3) does not exist.

To catch these rows as well, I added more loops:

SELECT 
  exception_id, customer_name, contract_id, product_name, product_price, product_property
FROM 
  t_xml_test_1,
  XMLTable(  
'
for $g1 in /customer/contract
for $g2 in $g1/products/product
for $g3 in $g2/properties/property
return
<root>
  <g0>{/customer/name}</g0>
  <g1>{$g1/contract_id}</g1>
  <g2>{$g2/name}{$g2/price}</g2>
  <g3>{$g3}</g3>
</root>,
for $g1 in /customer/contract
for  $g2 in $g1/products/product
  where empty($g2/properties/property)
return
<root>
  <g0>{/customer/name}</g0>
  <g1>{$g1/contract_id}</g1>
  <g2>{$g2/name}{$g2/price}</g2>
</root>,
for 
  $g1 in /customer/contract
  where empty($g1/products/product)
return
<root>
  <g0>{/customer/name}</g0>
  <g1>{$g1/contract_id}</g1>
</root>
' 
PASSING t_xml_test_1.xml_data
columns 
  customer_name CLOB path '/root/g0/name',
  contract_id CLOB path '/root/g1/contract_id', 
  product_name CLOB path '/root/g2/name', 
  product_price CLOB path '/root/g2/price',
  product_property CLOB path '/root/g3'
) xmltab;

This query works as expected, picking up all desired result rows.

Unfortunately this query performs very badly. The execution plan shows a "COLLECTION ITERATOR (PICKLER FETCH)" - I've learned that this should be avoided whenever possible.

My question: Is there any way to speed up this query or to use another technique to pick up those optional nodes? It seems quite ugly to concatenate multiple for loops.

We are running Oracle 11gR2 on RedHat 6.4 (Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production)

Any help is appreciated!

This post has been answered by odie_63 on Feb 10 2015

Jump to Answer

Locked Post

New comments cannot be posted to this locked post.

Locked on Mar 12 2015

Added on Feb 10 2015

#performance-availability, #xml, #xml-db

6 comments

8,541 views