dbms_xmlparser performance issues and/or known limitations
Hello,
I inherited PL/SQL code that relies on xmlparser to build a DOM representation of XML files uploaded as CLOBs. It then collects element values and attributes and turns them into records that are inserted as regular database rows.
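To make the pattern concrete, here is a minimal sketch of the parse-then-walk approach described above. It is not the actual inherited code: the sample document, the `row` element, the `id` attribute, and all variable names are made up for illustration, and it prints values with dbms_output instead of inserting rows.

```plsql
DECLARE
  -- Hypothetical sample input; the real input is an uploaded CLOB of several MB.
  l_xml    CLOB := '<rows><row id="1">alpha</row><row id="2">beta</row></rows>';
  l_parser dbms_xmlparser.Parser;
  l_doc    dbms_xmldom.DOMDocument;
  l_nodes  dbms_xmldom.DOMNodeList;
  l_node   dbms_xmldom.DOMNode;
  l_elem   dbms_xmldom.DOMElement;
BEGIN
  -- Build the in-memory DOM from the CLOB (this is the step that never returns).
  l_parser := dbms_xmlparser.newParser;
  dbms_xmlparser.parseClob(l_parser, l_xml);
  l_doc := dbms_xmlparser.getDocument(l_parser);
  dbms_xmlparser.freeParser(l_parser);

  -- Walk the elements of interest and pull out attribute and text values.
  l_nodes := dbms_xmldom.getElementsByTagName(l_doc, 'row');
  FOR i IN 0 .. dbms_xmldom.getLength(l_nodes) - 1 LOOP
    l_node := dbms_xmldom.item(l_nodes, i);
    l_elem := dbms_xmldom.makeElement(l_node);
    dbms_output.put_line(
      dbms_xmldom.getAttribute(l_elem, 'id') || ' => ' ||
      dbms_xmldom.getNodeValue(dbms_xmldom.getFirstChild(l_node)));
  END LOOP;

  -- Release the DOM explicitly; it is not freed automatically.
  dbms_xmldom.freeDocument(l_doc);
END;
/
```

In the real code the loop body fills a record and inserts it, but the parseClob call is where the time goes.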
The code was originally developed (I think) under 9i. I found some notes online that acknowledged limitations in the underlying Java parser at the time and said future versions would use C++ Xerces under the covers. We are now on 10.2, and I see that xmlparser and dbms_xmlparser are both synonyms for XDB.dbms_xmlparser. Fine, except the performance issues are apparently still there.
In practice I already split my input XML into many smaller files, some of which are still several megabytes long. A 2MB file completes, albeit after a very long time; a file over 8MB has not completed after more than 8 hours. I am not getting any exception: the parser keeps eating CPU like mad, yet does not return a DOMDocument in any reasonable amount of time. I tried calling setErrorLog() and showWarnings() to get more details, but these are simply not supported!
Granted, this is not a very powerful machine (HP-UX on PA-RISC), but it is a test db instance with little other activity. Are there known caveats, workarounds, or best practices when using the parser exposed from PL/SQL? Are there alternative implementations that work better? What are people using? Using DOM to parse big documents was asking for trouble in the first place, but on the other hand less than 10MB isn't that big nowadays, come on!
Any hint will be appreciated, thank you.
Bernard.