DocumentBuilder - getting more information
843834, Jun 17 2004 (edited Jun 18 2004)
Hi,
I'm currently working on an application which uses the DocumentBuilder class to build a DOM Document from a String representation of the XML document.
However, when I run it, the DocumentBuilder.parse method throws the following exception: java.io.UTFDataFormatException: Invalid byte 2 of 4-byte UTF-8 sequence (stack trace at the bottom).
There must be a byte sequence somewhere in the XML document which is not valid UTF-8, and I need to track it down.
Is there any way I can hook into the document builder to see how far through the XML document it has got when the exception occurs?
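One possible way to narrow this down without hooking into the parser at all, offered as a rough sketch rather than anything Xerces itself provides: the exception is a plain java.io.IOException subclass, so it carries no line or column information, but the same bytes can be run through a strict UTF-8 decoder beforehand to find the offset of the first bad sequence. The class and method names below (Utf8Check, firstInvalidUtf8Offset) are made up for illustration, and the sketch assumes the bytes handed to DocumentBuilder.parse are available as a byte array and that a JDK with java.nio.charset.StandardCharsets (Java 7 or later) is in use.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JAXP or Xerces.
class Utf8Check {

    /**
     * Returns the offset of the first byte sequence that is not valid UTF-8,
     * or -1 if the whole array decodes cleanly.
     */
    static int firstInvalidUtf8Offset(byte[] data) {
        // REPORT makes the decoder stop at bad input instead of silently
        // substituting a replacement character.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(1024);

        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                // The input position is left at the start of the bad sequence.
                return in.position();
            }
            if (result.isUnderflow()) {
                // All input consumed without error.
                return -1;
            }
            // Overflow: the decoded characters are not needed, so discard
            // them and keep decoding.
            out.clear();
        }
    }
}
```

Calling firstInvalidUtf8Offset on the bytes before parsing, and printing a few dozen bytes on either side of the returned offset, should show which attribute value contains the offending data. If the bytes come from String.getBytes() with no charset argument, it may also be worth checking whether the platform default encoding is something other than UTF-8.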
Stack trace is as follows:
java.io.UTFDataFormatException: Invalid byte 2 of 4-byte UTF-8 sequence.
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager$EntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager$EntityScanner.scanLiteral(Unknown Source)
at org.apache.xerces.impl.XMLScanner.scanAttributeValue(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanAttribute(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
at dbnexus.test.HierarchyExport.export(HierarchyExport.java:124)
Thanks!
Jim