Skip to Main Content

Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Flatten nested xml structure using xmltable

Nicolai ScheerFeb 10 2015 — edited Feb 12 2015

Hi!

I'm currently struggling to build a xmltable query which does the right thing and performs well.

The xml I'm trying to process is saved in a xmltype column and looks like this:

<customer>

    <name>Mustermann</name>

    <contract>

        <contract_id>C1</contract_id>

        <products>

            <product>

                <name>Product C1.P1</name>

                <price>23.12</price>

                <properties>

                    <property>Property C1.P1.A</property>

                    <property>Property C1.P1.B</property>

                </properties>

            </product>

            <product>

                <name>Product C1.P2</name>

                <price>2.32</price>

            </product>

        </products>

    </contract>

    <contract>

        <contract_id>C2</contract_id>

        <products>

            <product>

                <name>Product C2.P1</name>

                <price>143.33</price>

            </product>

            <product>

                <name>Product C2.P2</name>

                <price>231.76</price>

                <properties>

                    <property>Property C2.P2.A</property>

                    <property>Property C2.P2.B</property>

                </properties>

            </product>

        </products>

    </contract>

    <contract>

        <contract_id>C3</contract_id>

    </contract>

</customer>

The xml structure contains various repeating nodes most of which are optional. E.g. a contract does not need to have a product, a product does not need to have a property.

The desired output looks as follows:

customer_namecontract_idproduct_nameproduct_priceproduct_property
MustermannC1Product C1.P123.12Property C1.P1.A
MustermannC1Product C1.P123.12Property C1.P1.A
MustermannC1Product C1.P22.32(null)
MustermannC2Product C2.P1143.33(null)
MustermannC2Product C2.P2231.76Property C2.P2.A
MustermannC2Product C2.P2231.76Property C2.P2.B
MustermannC3(null)(null)(null)

So basically the nested nodes are split up into separate rows as if I performed multiple left joins.

The Query I tried looks as follows:

create table t_xml_test_1 (exception_id number, xml_data xmltype);

insert into t_xml_test_1 values( 123456, xmltype(

'<customer>

    <name>Mustermann</name>

    <contract>

       <contract_id>C1</contract_id>

       <products>

           <product>

               <name>Product C1.P1</name>

               <price>23.12</price>

               <properties>

                  <property>Property C1.P1.A</property>

                  <property>Property C1.P1.B</property>

               </properties>

           </product>

           <product>

               <name>Product C1.P2</name>

               <price>2.32</price>

           </product>

       </products>

    </contract>

    <contract>

       <contract_id>C2</contract_id>

       <products>

           <product>

               <name>Product C2.P1</name>

               <price>143.33</price>

           </product>

           <product>

               <name>Product C2.P2</name>

               <price>231.76</price>

               <properties>

                  <property>Property C2.P2.A</property>

                  <property>Property C2.P2.B</property>

               </properties>

           </product>

       </products>

    </contract>

    <contract>

       <contract_id>C3</contract_id>

    </contract>

</customer>' )

);

SELECT

  customer_name, contract_id, product_name, product_price, product_property

FROM

  t_xml_test_1,

  XMLTable( 

'

for $g1 in /customer/contract

for $g2 in $g1/products/product

for $g3 in $g2/properties/property

return

<root>

  <g0>{/customer/name}</g0>

  <g1>{$g1/contract_id}</g1>

  <g2>{$g2/name}{$g2/price}</g2>

  <g3>{$g3}</g3>

</root>

'

PASSING t_xml_test_1.xml_data

columns

  customer_name CLOB path '/root/g0/name',

  contract_id CLOB path '/root/g1/contract_id',

  product_name CLOB path '/root/g2/name',

  product_price CLOB path '/root/g2/price',

  product_property CLOB path '/root/g3'

) xmltab;

  exception_id, customer_name, contract_id, product_name, product_price, product_property

This query performs quite well. The problem is, that the nested loop does not pick up the rows where a sub-node (e.g. $g2, $g3) does not exist.

To catch these rows as well, I added more loops:

SELECT

  exception_id, customer_name, contract_id, product_name, product_price, product_property

FROM

  t_xml_test_1,

  XMLTable( 

'

for $g1 in /customer/contract

for $g2 in $g1/products/product

for $g3 in $g2/properties/property

return

<root>

  <g0>{/customer/name}</g0>

  <g1>{$g1/contract_id}</g1>

  <g2>{$g2/name}{$g2/price}</g2>

  <g3>{$g3}</g3>

</root>,

for $g1 in /customer/contract

for  $g2 in $g1/products/product

  where empty($g2/properties/property)

return

<root>

  <g0>{/customer/name}</g0>

  <g1>{$g1/contract_id}</g1>

  <g2>{$g2/name}{$g2/price}</g2>

</root>,

for

  $g1 in /customer/contract

  where empty($g1/products/product)

return

<root>

  <g0>{/customer/name}</g0>

  <g1>{$g1/contract_id}</g1>

</root>

'

PASSING t_xml_test_1.xml_data

columns

  customer_name CLOB path '/root/g0/name',

  contract_id CLOB path '/root/g1/contract_id',

  product_name CLOB path '/root/g2/name',

  product_price CLOB path '/root/g2/price',

  product_property CLOB path '/root/g3'

) xmltab;

This query works as expected, picking up all desired result rows.

Unfortunately this query performs very badly. The execution plan shows a "COLLECTION ITERATOR (PICKLER FETCH)" - I've learned that this should be avoided whenever possible.

My question: Is there any way to speed up this query or to use another technique to pick up those optional nodes? It seems quite ugly to concatenate multiple for loops.

We are running Oracle 11gR2 on RedHat 6.4 (Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production)

Any help is appreciated!

This post has been answered by odie_63 on Feb 10 2015
Jump to Answer
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Mar 12 2015
Added on Feb 10 2015
6 comments
8,470 views