SQL*LOADER and XML
446616Jun 30 2005 — edited Jul 1 2005Hello,
my problem is that I have a XML-Doc packed as .gz and that I want to load into the Oracle Database in relational tables.
At the moment there is a cronjob starting a c-script to unpack and transform the XML-Dokument via Java XSLT to make it comma separated. This document is loaded via SQL*Loader in a staging table.
there is a second os-job using this data from the staging table.
This takes too much cpu-time.
I tried several things to load the Doc(original.xml) into a xmltype table. There is a DTD given in the Doc so that I have to remove this anyway because Oracle can't handle it.
My start is, that I have a rowbased Java-Programm that eliminates all the linefeeds, surrounding tags and the DTD.(nach_java.xml)
Then I can upload it with(test_xml.ctl) the Loader in seprate lines in the table. These lines are read by .extract(...)..
The thing is that there is no validation of the data :-(
Is this a way to go or should I use another, more performant way of importing the data?
Can I load the XML as lob?
FTP-Upload is not useful because of the bad, discard file feature of SQLLoad.
Thanks in advance for your help or experiences in a similar context.
Dropnose
Files:
<<original.xml>>
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?> <!DOCTYPE NN_RBZ SYSTEM "nn.dtd"> <NN_RBZ>
<!-- Buchung 1 --> <Buchung Status="OK"> <UET>12345</UET> <UE> <BLZ>3456</BLZ> <Name>Name1</Name> <Wert Waehrung="EUR">100,01</Wert> <Text>10</Text>
<Verwendung1>1234 Blabla</Verwendung1>
</UE>
<EPOS>
<Kassennr>4711</Kassennr>
<Kassetag>090405</Kassetag>
<Journalsatz>1234</Journalsatz>
</EPOS>
<DTA-Datum>14042005</DTA-Datum>
<Paginierung>96585858</Paginierung>
</Buchung>
<!-- Buchung 2 --> <Buchung Status="OK"> <UET>54321</UET> <UE> <BLZ>6543</BLZ> <Name>Name2</Name> <Wert Waehrung="EUR">56,70</Wert> <Text>51</Text>
<Verwendung1>Meyer Re 123</Verwendung1>
</UE> <EPOS> <Kassennr>0815</Kassennr> <Kassetag>090405</Kassetag> <Journalsatz>1235</Journalsatz>
</EPOS>
<DTA-Datum>14042005</DTA-Datum>
<Paginierung>141040213</Paginierung>
</Buchung>
</NN_RBZ>
<</original.xml>>
<<nach_java.xml>>
<Buchung Status="OK"><UET>12345</UET><UE><BLZ>3456</BLZ><Name>Name1</Name><Wert Waehrung="EUR">100,01</Wert><Text>10</Text><Verwendung1>1234 Blabla</Verwendung1></UE><EPOS><Kassennr>4711</Kassennr><Kassetag>090405</Kassetag><Journalsatz>1234</Journalsatz></EPOS><DTA-Datum>14042005</DTA-Datum><Paginierung>96585858</Paginierung></Buchung>
<Buchung Status="OK"><UET>54321</UET><UE><BLZ>6543</BLZ><Name>Name2</Name><Wert Waehrung="EUR">56,70</Wert><Text>51</Text><Verwendung1>Meyer Re 123</Verwendung1></UE><EPOS><Kassennr>0815</Kassennr><Kassetag>090405</Kassetag><Journalsatz>1235</Journalsatz></EPOS><DTA-Datum>14042005</DTA-Datum><Paginierung>141040213</Paginierung></Buchung>
<</nach_java>>
<<test_xml.ctl>>
OPTIONS (DIRECT=FALSE, READSIZE=500000)
LOAD DATA
INFILE *
INTO TABLE test_xml
APPEND
XMLTYPE (xmldata)
FIELDS
(xmldata CHAR(500000))
<</test_xml.ctl>>