XMLType column based on XML Schema: several questions
flaviocDec 12 2005 — edited Dec 29 2005Hi,
I've a table on an oracle db version 10.1.0.4 where I stage the xml files containing orders created on a third party's system using BizTalk.
Although the storage I opted for is based on an XML Schema, defined by this third-party, I am facing big perfomance issues with files bigger than a few hundreds of kBs.
For instance, a 32Mb file takes more than 2 hours to be processed.
Now, after reading other threads in this forum and the documentation, my understanding of the problem is that the whole issue is with the correct indexing of the nested tables.
Here is my current XML Schema definition:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xdb="http://xmlns.oracle.com/xdb"
elementFormDefault="qualified"
attributeFormDefault="unqualified"
xdb:storeVarrayAsTable="true">
<xs:include schemaLocation="private/Types.xsd"/>
<xs:element name="PickData">
<xs:complexType xdb:maintainDOM="false">
<xs:sequence>
<xs:element name="ProdRun">
<xs:complexType xdb:maintainDOM="false">
<xs:sequence>
<xs:element name="Nr" type="xs:int"/>
<xs:element name="Date" type="string8"/>
<xs:element name="Final" type="xs:int"/>
<xs:element name="PickWave" maxOccurs="unbounded">
<xs:complexType xdb:maintainDOM="false">
<xs:sequence>
<xs:element name="Nr" type="string10"/>
<xs:element name="ProdLine" type="string2"/>
<xs:element name="TourSeq" type="xs:int"/>
<xs:element name="Tour" type="string20"/>
<xs:element name="Customer" maxOccurs="unbounded">
<xs:complexType xdb:maintainDOM="false">
<xs:sequence>
<xs:element name="Seq" type="string20"/>
<xs:element name="Cust" type="string10"/>
<xs:element name="Mod" type="string30"/>
<xs:element name="Tod" type="string30"/>
<xs:element name="InvOrder" maxOccurs="unbounded">
<xs:complexType xdb:maintainDOM="false">
<xs:sequence>
<xs:element name="Nr" type="string20"/>
<xs:element name="Item" type="string20"/>
<xs:element name="Qty" type="xs:int"/>
<xs:element name="Priority" type="xs:int"/>
<xs:element name="Reordering" type="xs:int"/>
<xs:element name="DelDate" type="string8"/>
<xs:element name="HlOrder" type="string20"/>
</xs:sequence>
</xs:complexType>
<xs:unique name="InvOrderKey">
<xs:selector xpath="InvOrder"/>
<xs:field xpath="Nr"/>
</xs:unique>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:unique name="CustomerKey">
<xs:selector xpath="Customer"/>
<xs:field xpath="Seq"/>
</xs:unique>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:unique name="PickWaveKey">
<xs:selector xpath="PickWave"/>
<xs:field xpath="Nr"/>
</xs:unique>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:unique name="ProdRunKey">
<xs:selector xpath="ProdRun"/>
<xs:field xpath="Nr"/>
</xs:unique>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Here is the included sub-schema:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:simpleType name="string2">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="2"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string5">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="5"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string6">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="6"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string8">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="8"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string10">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="10"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string15">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="15"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string20">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="20"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string30">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="30"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string40">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="40"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string50">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="50"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="string250">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="250"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>
The statement for creating my table is
CREATE TABLE "XML_ORDERS"
("ID" NUMBER(7,0) NOT NULL ENABLE,
"XMLFILE" "SYS"."XMLTYPE" ,
"INSERTED" DATE DEFAULT sysdate,
CONSTRAINT "XML_ORDERS_PK" PRIMARY KEY ("ID") USING INDEX ENABLE
) XMLTYPE COLUMN XMLFILE STORE AS OBJECT RELATIONAL
XMLSCHEMA "private/PickData.xsd" ELEMENT "PickData"
Here is a simple instance document:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<PickData xsi:noNamespaceSchemaLocation="private/PickData.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProdRun>
<Nr>5</Nr>
<Date>15112005</Date>
<Final>1</Final>
<PickWave>
<Nr>IPW0000017</Nr>
<ProdLine>01</ProdLine>
<TourSeq>1</TourSeq>
<Tour>00000043_078</Tour>
<Customer>
<Seq>5</Seq>
<Cust>100000006</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000457</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000742</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000459</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000742</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>6</Seq>
<Cust>100000013</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000461</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000743</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000463</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000743</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>2</Seq>
<Cust>100000114</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000465</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000744</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000467</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000744</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>3</Seq>
<Cust>100000140</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000469</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000745</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000471</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000745</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>7</Seq>
<Cust>100000143</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000473</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000746</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000475</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000746</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>8</Seq>
<Cust>100000145</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000477</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000747</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000479</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000747</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>9</Seq>
<Cust>100000146</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000481</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>0</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000748</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000483</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000748</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>4</Seq>
<Cust>100000147</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000485</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>0</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000750</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000487</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000750</HlOrder>
</InvOrder>
</Customer>
<Customer>
<Seq>10</Seq>
<Cust>100000148</Cust>
<Mod>FO</Mod>
<Tod>DDU</Tod>
<InvOrder>
<Nr>IIO0000489</Nr>
<Item>100000036</Item>
<Qty>20</Qty>
<Priority>0</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000751</HlOrder>
</InvOrder>
<InvOrder>
<Nr>IIO0000491</Nr>
<Item>100000045</Item>
<Qty>20</Qty>
<Priority>1</Priority>
<Reordering>0</Reordering>
<DelDate>15112005</DelDate>
<HlOrder>CSO000751</HlOrder>
</InvOrder>
</Customer>
</PickWave>
</ProdRun>
</PickData>
When I registered the XMLSchema, the following types and tables were automatically created and you can see the hierarchy below:
(by the way, I could not find any xdb_utilities.printNestedTables mentioned elsewhere)
XML_ORDERS
|_PickData381_T
|___ProdRun382_T
|_____PickWave388_COLL
|_______PickWave383_T
|_________Customer387_COLL
|___________Customer384_T
|_____________InvOrder386_COLL
These objects are then used in the following nested tables:
TABLE_NAME TABLE_TYPE_NAME PARENT_TABLE_NAME PARENT_TABLE_COLUMN
SYS_NTaK/5zar5S0WitSsgu6OKPQ== PickWave388_COLL PickData389_TAB "XMLDATA"."ProdRun"."PickWave"
SYS_NTf6QvwVm8SFKz+K/YYWq+WQ== Item408_COLL ProdData409_TAB "XMLDATA"."Item"
SYS_NTtu05ilrRQqmuEN4k+07VDA== Customer402_COLL OutboundParty403_TAB "XMLDATA"."Customer"
SYS_NTK6fhWq5uTJ+vKcgBpNm1Fg== InvOrder386_COLL SYS_NTIIzv7bkXQSSS43igtfi5eg== InvOrder
SYS_NTIIzv7bkXQSSS43igtfi5eg== Customer387_COLL SYS_NTaK/5zar5S0WitSsgu6OKPQ== Customer
I enabled sql tracing and I got the following TKPROF output
********************************************************************************
INSERT INTO IMP_ORDERS (PICK_INVORDERNR, PICK_ITEM, PICK_QTY, PICK_PRIORITY,
PICK_REORDERING, PICK_HLORDER, PICK_DELDATE, PICK_CUST, PICK_MOD, PICK_TOD,
PICK_SEQ, PICK_PICKWAVENR, PICK_PICKWAVEPRODLINE, PICK_PICKWAVETOUR,
PICK_PICKWAVETOURSEQ, PICK_ORDKEY, PICK_RUNKEY) SELECT INVORDERNR, ITEM,
QTY, PRIORITY, REORDERING, HLORDER, DELDATE, CUST, MOD, TOD, SEQ,
PICKWAVENR, PICKWAVEPRODLINE, PICKWAVETOUR, PICKWAVETOURSEQ, ROWNUM AS
PICK_ORDKEY, PRODRUNID FROM (SELECT /*+ cardinality(g 15)*/
EXTRACTVALUE(VALUE(G), '/InvOrder/Nr') AS INVORDERNR, EXTRACTVALUE(VALUE(G),
'/InvOrder/Item') AS ITEM, EXTRACTVALUE(VALUE(G), '/InvOrder/Qty') AS QTY,
EXTRACTVALUE(VALUE(G), '/InvOrder/Priority') AS PRIORITY,
EXTRACTVALUE(VALUE(G), '/InvOrder/Reordering') AS REORDERING,
EXTRACTVALUE(VALUE(G), '/InvOrder/HlOrder') AS HLORDER,
TO_DATE(EXTRACTVALUE(VALUE(G), '/InvOrder/DelDate'),'DDMMYYYY') AS DELDATE,
F.CUST, F.MOD, F.TOD, F.SEQ, F.PICKWAVENR, F.PICKWAVEPRODLINE,
F.PICKWAVETOUR, F.PICKWAVETOURSEQ, F.PRODRUNNR, F.PRODRUNDATE,
F.PRODRUNFINAL, F.PRODRUNID FROM (SELECT /*+ cardinality(e 60)*/VALUE(E) AS
CUSTOMERNODE, EXTRACTVALUE(VALUE(E), '/Customer/Cust') AS CUST,
EXTRACTVALUE(VALUE(E), '/Customer/Mod') AS MOD, EXTRACTVALUE(VALUE(E),
'/Customer/Tod') AS TOD, TO_NUMBER(EXTRACTVALUE(VALUE(E), '/Customer/Seq'))
AS SEQ, D.PICKWAVENR, D.PICKWAVEPRODLINE, D.PICKWAVETOUR, D.PICKWAVETOURSEQ,
D.PRODRUNNR, D.PRODRUNDATE, D.PRODRUNFINAL, D.PRODRUNID FROM (SELECT /*+
cardinality(c 100)*/VALUE(C) AS PICKWAVENODE, EXTRACTVALUE(VALUE(C),
'/PickWave/Nr') AS PICKWAVENR, TO_NUMBER(EXTRACTVALUE(VALUE(C),
'/PickWave/ProdLine')) AS PICKWAVEPRODLINE, EXTRACTVALUE(VALUE(C),
'/PickWave/Tour') AS PICKWAVETOUR, TO_NUMBER(EXTRACTVALUE(VALUE(C),
'/PickWave/TourSeq')) AS PICKWAVETOURSEQ, A.PRODRUNNR, A.PRODRUNDATE,
A.PRODRUNFINAL, A.PRODRUNID FROM (SELECT /*+ cardinality(b 1)*/VALUE(B) AS
PRODRUNNODE, EXTRACTVALUE(VALUE(B), '/ProdRun/Nr') AS PRODRUNNR,
TO_DATE(EXTRACTVALUE(VALUE(B), '/ProdRun/Date'),'DDMMYYYY') AS PRODRUNDATE,
EXTRACTVALUE(VALUE(B), '/ProdRun/Final') AS PRODRUNFINAL, X.ID PRODRUNID
FROM XML_ORDERS X, TABLE(XMLSEQUENCE(EXTRACT(X.XMLFILE,'/PickData/ProdRun'))
) B WHERE X.ID = :B1 ) A, TABLE(XMLSEQUENCE(EXTRACT(A.PRODRUNNODE,
'/ProdRun/PickWave'))) C ) D, TABLE(XMLSEQUENCE(EXTRACT(D.PICKWAVENODE,
'/PickWave/Customer'))) E ) F, TABLE(XMLSEQUENCE(EXTRACT(F.CUSTOMERNODE,
'/Customer/InvOrder'))) G ORDER BY PICKWAVEPRODLINE, PICKWAVETOURSEQ,
PICKWAVENR, SEQ )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 4324.09 9994.65 0 57193 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 4324.09 9994.65 0 57193 0 0
Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 68 (recursive depth: 1)
Rows Row Source Operation
------- ---------------------------------------------------
0 COUNT (cr=0 pr=0 pw=0 time=180 us)
0 VIEW (cr=0 pr=0 pw=0 time=166 us)
0 SORT ORDER BY (cr=0 pr=0 pw=0 time=152 us)
40866 NESTED LOOPS (cr=54973 pr=0 pw=0 time=31065606 us)
1363 NESTED LOOPS (cr=54937 pr=0 pw=0 time=11037183 us)
1 NESTED LOOPS (cr=54889 pr=0 pw=0 time=10145883 us)
1 NESTED LOOPS (cr=54841 pr=0 pw=0 time=9799012 us)
1 TABLE ACCESS BY INDEX ROWID XML_ORDERS (cr=2 pr=0 pw=0 time=222 us)
1 INDEX UNIQUE SCAN XML_ORDERS_PK (cr=1 pr=0 pw=0 time=126 us)(object id 58551)
1 COLLECTION ITERATOR PICKLER FETCH (cr=54839 pr=0 pw=0 time=9798748 us)
1 COLLECTION ITERATOR PICKLER FETCH (cr=48 pr=0 pw=0 time=346818 us)
1363 COLLECTION ITERATOR PICKLER FETCH (cr=48 pr=0 pw=0 time=870830 us)
40866 COLLECTION ITERATOR PICKLER FETCH (cr=36 pr=0 pw=0 time=18739302 us)
Note that I cancelled this operation before it was over so I imagine that these figures refer to the statistics as of the time when the operation was interrupted.
So, here are finally my questions.
In order to create the constraints on the nested tables as shown in other threads, do I need to drop the existing xml_orders table and ancillary object types and recreate them or is there a way to add such constraints using the existing system generated object names?
Secondly, the xml_orders table may contain severale documents, not just one and his current primary key is the column ID. So, in order to uniquely identify the deepest element in the xml document, I need first to select the relevant document by means of the id column.
Would it be better to create the indexes containing this id column together with the nested_table_id and array_index?
Thanks for you help.
Flavio
PS: I wrote a 10 lines xsl transformation that I passed on to Saxon together with the 32Mb file. It took less than 1 minute to produce a flat file that was loaded almost instantly by SQL*Loader. So, what I am looking for is a procedure loading this stuff in less than 2 minutes or possibly less.