Hi all,
I am writing in the forum first because it could be that I am doing something wrong, but I think it is a bug.
I am using Java 6, and this has been reproduced on both Windows and Linux.
java version "1.6.0_03"
Problem:
Read an XML file into an org.w3c.dom.Document.
The XML file has some attributes which contain an ampersand. These are escaped as (I think) is prescribed by the rules of XML. For example:
<?xml version="1.0" encoding="UTF-8"?>
<lang>
<text dna="8233" ro="chisturi de plex coroid (&amp;gt;=1.5 mm)" it="Cisti del plesso corioideo(&amp;gt;=1.5mm)" tr="Koroıd pleksus kisti (&amp;gt;=1.5 mm)" pt_br="Cisto do plexo coróide (&amp;gt;=1,5 mm)" de="Choroidplexus Zyste (&amp;gt;=1,5 mm)" el="Κύστεις χοροειδούς πλέγματος (&amp;gt;= 1.5 mm)" zh_cn="脉络膜囊肿(&amp;gt;= 1.5 mm)" pt="Quisto do plexo coroideu (&amp;gt;=1,5 mm)" bg="Киста на хориоидния плексус (&amp;gt;= 1.5 mm)" fr="Kystes du plexus choroide (&amp;gt;= 1,5 mm)" en="Choroid plexus cysts (&amp;gt;=1.5 mm)" ru="кисты сосудистых сплетений (&amp;gt;=1.5 mm)" es="Quiste del plexo coroideo (&amp;gt;=1.5 mm)" ja="脈絡膜嚢胞(&amp;gt;=1.5mm)" nl="Plexus choroidus cyste (&amp;gt;= 1,5 mm)" />
</lang>
As you might understand, we need to have the fixed text '&gt;' for later processing (not the greater-than symbol '>' but the escaped version of it).
Therefore, I escape (encode?) the ampersand and leave the rest of the text as is. And so my &gt; becomes &amp;gt;
All ok?
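To make the expectation concrete, here is a minimal sketch of how a conforming parser should treat the two forms. It assumes the standard JAXP DocumentBuilder API; the class name, the helper readAttribute, and the shortened inline XML are illustrative only, not the original code or file.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class EscapedAmpersandSketch {

    // Hypothetical helper: parses the XML string and returns the named
    // attribute of the first <text> element.
    static String readAttribute(String xml, String name) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes("UTF-8")));
        Element text = (Element) doc.getElementsByTagName("text").item(0);
        return text.getAttribute(name);
    }

    public static void main(String[] args) throws Exception {
        // Ampersand escaped: the parser turns &amp; back into &,
        // so the literal text &gt; survives in the attribute value.
        String escaped = "<lang><text dna=\"8233\" "
                + "en=\"Choroid plexus cysts (&amp;gt;=1.5 mm)\" "
                + "de=\"Choroidplexus Zyste (&amp;gt;=1,5 mm)\"/></lang>";
        System.out.println(readAttribute(escaped, "en"));
        // prints: Choroid plexus cysts (&gt;=1.5 mm)

        // Ampersand not escaped: &gt; is an entity reference and is
        // resolved to the single character '>'.
        String unescaped = "<lang><text dna=\"8233\" "
                + "en=\"Choroid plexus cysts (&gt;=1.5 mm)\"/></lang>";
        System.out.println(readAttribute(unescaped, "en"));
        // prints: Choroid plexus cysts (>=1.5 mm)
    }
}
```

In both cases the value returned for "en" should be the text written in the "en" attribute, never that of another attribute.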
Symptom:
When fetching attributes, for example with a getAttribute("en") type call, the wrong attribute values are returned.
Not only that: if I only read the file into a Document instance and write it back to a file, the attributes come out mixed up.
eg:
dna: 8233, ro=chisturi de plex coroid (>=1.5 mm), en=кисты сосудистых сплетений (>=1, de=Choroidplexus Zyste (>=1,5 mm)
Here you can see that 'en' is shown holding what looks like Greek (what is ru as a country code anyway?), where it should obviously have had the English text that was originally associated with the attribute 'en'.
This seems very strange and unexpected to me. I would have thought that by escaping (encoding) the ampersand I had fulfilled everything required of me, and that should be that.
No error seems to occur either; we simply get the wrong values when fetching attributes.
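For anyone who wants to try the read-then-write case, the round trip can be sketched roughly as below. This assumes the standard JAXP Transformer API with a default (identity) transformer; the class name RoundTripSketch and the shortened inline XML are illustrative, not my actual code or file. Note that DOM does not promise to preserve attribute order, so only the name-to-value pairing is worth checking.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public class RoundTripSketch {

    // Hypothetical helper: parses the XML into a Document and serializes
    // it straight back to a string without modifying anything.
    static String roundTrip(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        // A Transformer created without a stylesheet copies input to output.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<lang><text dna=\"8233\" "
                + "en=\"Choroid plexus cysts (&amp;gt;=1.5 mm)\" "
                + "de=\"Choroidplexus Zyste (&amp;gt;=1,5 mm)\"/></lang>";
        // Each attribute name should still be paired with its own value,
        // and the ampersand should be written out escaped again.
        System.out.println(roundTrip(xml));
    }
}
```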
Am I doing something wrong? Or is this a bug that should be submitted?
Kind Regards, and thanks to all responders/readers.
Sean
P.S. Previously I had not been escaping the ampersand. This meant that I lost my ampersand when fetching attributes, AND the attribute order was ALSO WRONG!
In fact, the wrong order was what led me to read about how to correctly encode an ampersand in the first place. I had been hoping that correct encoding would fix the order problem, but it didn't.
Edited by: svaens on Mar 31, 2008 6:21 AM