I want to parse an XML document in which one element contains text that has been compressed in a special way. The compression produces characters with Unicode values of 20, 1 and even 0. This makes the Xerces and Crimson parsers fail. How can I solve this?
Below are the .xml and the .dtd. To reproduce the error, run the document through, for example, DOMCounter in xercesSamples.jar.
XML----------
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE IdAndTrajectory SYSTEM "file:/c:/Progress/dtd/IdAndTrajectory.dtd">
<IdAndTrajectory id="abc123">
<InvexorTrajectoryFile>DCL:0005
ABC123 ? Avtal1 ���K7?
7 </InvexorTrajectoryFile>
</IdAndTrajectory>
DTD----------
<!ELEMENT IdAndTrajectory (InvexorTrajectoryFile)>
<!ATTLIST IdAndTrajectory id CDATA #REQUIRED>
<!ELEMENT InvexorTrajectoryFile (#PCDATA)>
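For context, XML 1.0 only permits tab, line feed, carriage return and characters from U+0020 upward (excluding surrogates, U+FFFE and U+FFFF) in a document, so values such as 0 and 1 are rejected by a conforming parser. Below is a minimal Java sketch of that check. The class and method names are hypothetical, and the sample string is made up for illustration, not the real payload.

```java
// Sketch: find characters that the XML 1.0 "Char" production does not allow.
// XmlCharCheck, isXmlChar and firstIllegalIndex are hypothetical names.
class XmlCharCheck {

    // True if the code point matches the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Index (in UTF-16 code units) of the first disallowed character,
    // or -1 if every character is allowed.
    static int firstIllegalIndex(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up sample ending in the control character with value 1.
        String sample = "DCL:0005\nABC\u0001";
        System.out.println("first illegal index: " + firstIllegalIndex(sample));
    }
}
```

Running such a check over the element text before writing the document would at least show which positions the parser will complain about.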
Thanks in advance to anyone who can help me out with this.
/ulrik