Parsing XML string into DOM with Unicode entities fails on Linux only

843834, Jan 31 2008 (edited Feb 1 2008)

Hi all

I have a web controller that receives XML as an HTTP request parameter. The XML looks something like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<delta><invoice id="2112"><htmlfooter><P>Aucune vaccination n’est exigée. en cas d'interdiction d'entrée</htmlfooter></invoice></delta>

This string is parsed into a DOM using

DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new InputSource(new StringReader(xml)))
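
For reference, here is a minimal self-contained sketch of that call, so the behaviour can be reproduced outside the web controller. The class name `ParseXmlString`, the helper `parse`, and the shortened sample document are made up for illustration, and writing the apostrophe as the numeric character reference `&#8217;` is only an assumption about how it arrives in the request. Note that when the parser is handed a character stream (a `Reader`), the characters are already decoded, so the `encoding="ISO-8859-1"` declaration in the prolog should not affect how the string is read.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ParseXmlString {

    // Hypothetical helper wrapping the one-liner from the post.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The InputSource wraps a Reader, so the parser receives characters, not bytes.
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample; &#8217; is an assumed spelling of the right single quote (U+2019).
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<delta><invoice id=\"2112\"><htmlfooter>"
                + "n&#8217;est exig\u00e9e"
                + "</htmlfooter></invoice></delta>";

        Document doc = parse(xml);
        String text = doc.getDocumentElement().getTextContent();

        // Compare against the expected characters instead of printing them,
        // so the console encoding cannot influence the result.
        System.out.println("matches expected text: " + text.equals("n\u2019est exig\u00e9e"));
    }
}
```

If the comparison holds on both machines, the DOM contains the right character and the difference is more likely to come from how the text is written out afterwards.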

On my local Windows development machine, this runs fine. On our Linux server, however, the Unicode entity ’ becomes a '?'. Since the XML contains various Latin-1 characters that are parsed correctly, I assume the XML encoding itself is OK.
I have one debug statement that first writes out the plain string, which looks good. A second debug statement writes the getTextContent() of the XML nodes, and there n’est becomes n?est.
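
One way to check whether the '?' is really in the DOM or only appears in the debug output is to print code points instead of characters. The sketch below is an illustration under assumptions, not a diagnosis: `CodePointDump`, `dump` and `roundTripLatin1` are hypothetical names, and the ISO-8859-1 round trip merely simulates what could happen if the text passes through a Latin-1 writer or log stream somewhere on the Linux machine. U+2019 has no ISO-8859-1 encoding, so Java substitutes '?', while Latin-1 characters such as é survive.

```java
import java.nio.charset.StandardCharsets;

class CodePointDump {

    // Render each code point as U+XXXX so the output is plain ASCII
    // and cannot be altered by the console or log file encoding.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Simulates writing the text through an ISO-8859-1 stream and reading it back.
    // Characters that ISO-8859-1 cannot represent are replaced with '?'.
    static String roundTripLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Stand-in for the value returned by getTextContent().
        String text = "n\u2019est exig\u00e9e";

        System.out.println("original:   " + dump(text));
        System.out.println("via Latin1: " + dump(roundTripLatin1(text)));
    }
}
```

If `dump(node.getTextContent())` still shows U+2019 on the Linux server, the parser is probably fine and the place to look would be the encoding used by the logger, console, or response writer.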

Any ideas where to look?
Thanks for your help
Simon
