Java EE (Java Enterprise Edition) General Discussion

ignoring invalid unicode characters when parsing an XML Stream

843834, May 16 2006 (edited Mar 9 2008)
I'm working on an application that extracts a large number of records from a database in the form of XML data (you can think of each record as a separate XML file) and parses that data. I'm using the Java SE 1.5.06 JDK with StAX for parsing. Some portions of my data contain invalid XML characters:
") as the DamkAvhler number Da B"* 0. A numerical simulation of the integrod"
Naturally, as soon as the cursor gets to an element that contains these characters, I get the following error:
ParseError at [row,col]:[46,36]
Message: An invalid XML character (Unicode: 0x1b) was found in the element content of the document.
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[46,36]
How can I tell the parser to ignore these characters and not read them at all? Is there anything I can do to get as much of the data as possible without having to load the whole thing as a huge string and replace the bad characters?
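One possible way to do this without buffering each record as a string is to wrap the input in a java.io.FilterReader that drops characters outside the XML 1.0 Char range before StAX sees them. The sketch below is only an illustration under stated assumptions: the class name XmlCharFilterReader and the helpers isAllowed and extractText are hypothetical, the data is assumed to be available as a Reader (character stream), and silently dropping the bad characters is assumed to be acceptable for the data.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: removes characters that XML 1.0 forbids, chunk by chunk. */
class XmlCharFilterReader extends FilterReader {

    XmlCharFilterReader(Reader in) {
        super(in);
    }

    /**
     * XML 1.0 allows tab, LF, CR, U+0020..U+D7FF, U+E000..U+FFFD and the
     * supplementary planes. This check works per UTF-16 code unit, so
     * surrogates are passed through (assumption: they arrive correctly paired).
     */
    static boolean isAllowed(char c) {
        return c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0xFFFD);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n; // -1 at end of stream, or 0 when len == 0
            }
            // Compact the allowed characters to the front of the chunk.
            int kept = off;
            for (int i = off; i < off + n; i++) {
                if (isAllowed(buf[i])) {
                    buf[kept++] = buf[i];
                }
            }
            if (kept > off) {
                return kept - off;
            }
            // The whole chunk was invalid: read again instead of returning 0.
        }
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c != -1 && !isAllowed((char) c));
        return c;
    }

    /** Demo: parses the filtered stream with StAX and concatenates the text content. */
    static String extractText(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new XmlCharFilterReader(new StringReader(xml)));
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                text.append(reader.getText());
            }
        }
        reader.close();
        return text.toString();
    }
}
```

In the real application the StringReader would be replaced by whatever Reader the database record is read from. Note that the [row,col] positions the parser reports afterwards refer to the filtered stream, so they can differ from positions in the raw data.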
Locked on Apr 6 2008
Added on May 16 2006
6 comments
5,958 views