Java EE (Java Enterprise Edition) General Discussion

How to parse a UTF-8 encoded XML message containing special characters?

843834, Jun 20 2008 (edited Jun 30 2008)
Hi all,

I've got a problem parsing UTF-8 encoded XML messages.

I'm using JUnitEE for testing. If an exception occurs during test execution, the exception messages of the failed tests are pasted into the XML string generated by JUnitEE.
These messages contain special characters (umlauts such as ä, ö, ü and so on). When I try to parse these XML messages, a SAXParseException is thrown because of "illegal" characters inside the UTF-8 encoded XML.

Is there a way to prevent the SAXParseException from being thrown? How can I parse the XML messages without getting these encoding-related exceptions? Do I have to "hack" the XML's encoding before parsing? Is that possible, and if so, how?
Unfortunately, the encoding written by the JUnitEE class XMLOutput.java is hardcoded. It cannot be changed to ISO-8859-1 by setting a parameter, which would have solved my problem very nicely.

Changing all the special characters inside the application's messages is not a desired solution.

Your help is highly appreciated. Thanks.

Marc
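
One possible approach to the question above, as a minimal sketch only: it assumes the bytes in the generated XML are really ISO-8859-1 even though the declaration says UTF-8 (the post suggests this, but does not confirm it). If the application decodes the bytes itself and hands the parser a character stream, the SAX `InputSource` documentation states that the parser reads that stream directly and disregards the encoding declaration. The class name `LenientXmlParser`, the method `parseAs`, and the sample XML are hypothetical and not part of JUnitEE.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class LenientXmlParser {

    // Decode the bytes with the charset they were actually written in and
    // pass the parser a Reader, so the declared encoding is not used for decoding.
    public static Document parseAs(InputStream in, Charset actualCharset) throws Exception {
        Reader reader = new InputStreamReader(in, actualCharset);
        InputSource source = new InputSource(reader);
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: declared as UTF-8, but encoded as ISO-8859-1 (assumption).
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<testcase><failure>Pr\u00fcfung fehlgeschlagen</failure></testcase>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        Document doc = parseAs(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

This leaves the application's messages untouched and does not require patching XMLOutput.java. It only helps if the real encoding of the bytes is known; if the guess is wrong, the parser will not complain but the umlauts will come out as wrong characters.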
Locked on Jul 28 2008
Added on Jun 20 2008