XMLReader throws "Invalid UTF8 encoding." - Need parser for ISO-8859-1 chars
Hi,
We are facing an issue when we send data encoded in the "ISO-8859-1" charset (German characters) via the EMDClient (agent), which parses it using oracle.xml.parser.v2.XMLParser. The parser is unable to determine the charset encoding of our data, assumes that the encoding is "UTF-8", and then fails while reading it with the exception:
"java.io.UTFDataFormatException: Invalid UTF8 encoding."
I looked at the XMLReader code and found that it reads the first 4 bytes (Byte Order Mark - BOM) to determine the encoding. It probably expects the data to begin with a line such as:
<?xml version="1.0" encoding="ISO-8859-1" ?>
However, the data that our application sends typically looks like this:
========================================================
# listener.ora Network Configuration File: /ade/vivsharm_emsa2/oracle/work/listener.ora
# Generated by Oracle configuration tools.
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = semsa2)
      (ORACLE_HOME = /ade/vivsharm_emsa2/oracle)
    )
  )
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = tcp)(HOST = stadm18.us.oracle.com)(PORT = 15100))
      )
    )
  )
========================================================
The first 4 bytes in our case will be int[] {35, 32, 108, 105}, i.e. the chars {#, SPACE, l, i},
which do not match any of the encodings predefined in the oracle.xml.parser.v2.XMLReader.pushXMLReader() method.
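To illustrate the kind of check we mean, here is a minimal sketch of encoding detection from the first four bytes, loosely following the auto-detection table in Appendix F of the XML 1.0 specification. This is not the Oracle XMLReader code; the class EncodingSniffer and its return strings are hypothetical, and the UCS-4 and EBCDIC cases are left out for brevity.

```java
final class EncodingSniffer {
    // Guesses an encoding family from the first four bytes of a stream.
    // Falls back to UTF-8, the default when no BOM or XML declaration is found.
    static String guess(byte[] head) {
        if (head.length < 4) {
            return "UTF-8 (default)";
        }
        int b0 = head[0] & 0xFF;
        int b1 = head[1] & 0xFF;
        int b2 = head[2] & 0xFF;
        int b3 = head[3] & 0xFF;

        // Byte order marks (UCS-4 variants omitted in this sketch).
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return "UTF-8 (BOM)";
        if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE (BOM)";
        if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE (BOM)";

        // No BOM: look for the start of "<?xml" in various encodings.
        if (b0 == 0x00 && b1 == 0x3C && b2 == 0x00 && b3 == 0x3F) return "UTF-16BE (no BOM)";
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x3F && b3 == 0x00) return "UTF-16LE (no BOM)";
        if (b0 == 0x3C && b1 == 0x3F && b2 == 0x78 && b3 == 0x6D) {
            // "<?xm" in an ASCII-compatible encoding: the actual charset
            // must be read from the encoding="..." pseudo-attribute.
            return "ASCII-compatible (read XML declaration)";
        }
        return "UTF-8 (default)";
    }

    public static void main(String[] args) {
        // The first four bytes of our listener.ora payload: '#', ' ', 'l', 'i'.
        byte[] head = {35, 32, 108, 105};
        System.out.println(guess(head));
    }
}
```

With our bytes no rule matches, so a detector of this shape falls through to the UTF-8 default, which is consistent with the XMLUTF8Reader frames in the stack trace.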
How do we ensure that the parser identifies the encoding properly and instantiates the correct reader for "ISO-8859-1"?
Should we just add the line <?xml version="1.0" encoding="ISO-8859-1" ?> at the beginning of our data?
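For reference, a minimal sketch of what the declaration approach would look like, assuming the payload is wrapped in a well-formed XML document before it reaches the parser. It uses the JDK's built-in parser rather than oracle.xml.parser.v2, so whether the Oracle parser behaves the same way is an assumption; the class DeclaredEncodingDemo, the <data> element and the sample text are hypothetical. The declaration uses the registered charset name "ISO-8859-1".

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class DeclaredEncodingDemo {
    // Wraps the text in a hypothetical <data> element, prefixes an XML
    // declaration naming ISO-8859-1, and encodes the result in that charset.
    // The body is not escaped here, so it must not contain '<' or '&'.
    static byte[] withDeclaration(String body) {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<data>" + body + "</data>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Parses raw bytes and returns the text of the root element. The parser
    // picks the charset from the declaration because it only sees bytes.
    static String parseText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // "Gr\u00fc\u00dfe" is the German word "Gruesse" written with u-umlaut and sharp s.
        String original = "Gr\u00fc\u00dfe";
        String roundTripped = parseText(withDeclaration(original));
        System.out.println("round trip ok: " + original.equals(roundTripped));
    }
}
```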
We have tried constructing the input stream (a ByteArrayInputStream) from String.getBytes("ISO-8859-1") and passing that to the parser, but that does not seem to work.
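One possible reason this does not help: getBytes("ISO-8859-1") produces exactly the bytes that a UTF-8 reader rejects, and a parser that is handed only bytes still has to guess the charset. An alternative is to do the decoding ourselves and hand the parser characters instead of bytes. Below is a minimal sketch of that idea using the standard org.xml.sax.InputSource with a Reader and the JDK's built-in parser; we have not verified it against oracle.xml.parser.v2 or the EMDClient code path, and the class ReaderInputDemo and the <data> element are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class ReaderInputDemo {
    // Decodes the stream as ISO-8859-1 up front and gives the parser a
    // character stream, so no byte-level encoding detection is needed.
    static String parseLatin1(InputStream in) {
        try {
            Reader chars = new InputStreamReader(in, StandardCharsets.ISO_8859_1);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(chars))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // No XML declaration here: the bytes alone would be read as UTF-8.
        String original = "Gr\u00fc\u00dfe";
        byte[] raw = ("<data>" + original + "</data>").getBytes(StandardCharsets.ISO_8859_1);
        String parsed = parseLatin1(new ByteArrayInputStream(raw));
        System.out.println("round trip ok: " + original.equals(parsed));
    }
}
```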
Please suggest.
Thanks & Regards,
Vivek.
PS: The exception we get is as below:
java.io.UTFDataFormatException: Invalid UTF8 encoding.
    at oracle.xml.parser.v2.XMLUTF8Reader.checkUTF8Byte(XMLUTF8Reader.java:160)
    at oracle.xml.parser.v2.XMLUTF8Reader.readUTF8Char(XMLUTF8Reader.java:187)
    at oracle.xml.parser.v2.XMLUTF8Reader.fillBuffer(XMLUTF8Reader.java:120)
    at oracle.xml.parser.v2.XMLByteReader.saveBuffer(XMLByteReader.java:450)
    at oracle.xml.parser.v2.XMLReader.fillBuffer(XMLReader.java:2229)
    at oracle.xml.parser.v2.XMLReader.tryRead(XMLReader.java:994)
    at oracle.xml.parser.v2.XMLReader.scanXMLDecl(XMLReader.java:2788)
    at oracle.xml.parser.v2.XMLReader.pushXMLReader(XMLReader.java:502)
    at oracle.xml.parser.v2.XMLReader.pushXMLReader(XMLReader.java:205)
    at oracle.xml.parser.v2.XMLParser.parse(XMLParser.java:180)
    at org.xml.sax.helpers.ParserAdapter.parse(ParserAdapter.java:431)
    at oracle.sysman.emSDK.emd.comm.RemoteOperationInputStream.readXML(RemoteOperationInputStream.java:363)
    at oracle.sysman.emSDK.emd.comm.RemoteOperationInputStream.readHeader(RemoteOperationInputStream.java:195)
    at oracle.sysman.emSDK.emd.comm.RemoteOperationInputStream.read(RemoteOperationInputStream.java:151)
    at oracle.sysman.emSDK.emd.comm.EMDClient.remotePut(EMDClient.java:2075)
    at oracle.sysman.emo.net.util.agent.Operation.saveFile(Operation.java:758)
    at oracle.sysman.emo.net.common.WebIOHandler.saveFile(WebIOHandler.java:152)
    at oracle.sysman.emo.net.common.BaseWebConfigContext.saveConfig(BaseWebConfigContext.java:505)
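The root cause suggested by the trace (ISO-8859-1 bytes being read as UTF-8) can be checked outside the agent with the JDK alone. The following is a minimal sketch, not the Oracle reader: the class Utf8Check and the sample text are hypothetical, and it uses a strict java.nio UTF-8 decoder to stand in for XMLUTF8Reader.checkUTF8Byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    // REPORT makes the decoder throw instead of silently substituting U+FFFD.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A comment line containing u-umlaut (0xFC) and sharp s (0xDF).
        String text = "# Gr\u00fc\u00dfe";
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("ISO-8859-1 bytes valid as UTF-8: " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes valid as UTF-8: " + isValidUtf8(utf8));
    }
}
```

In ISO-8859-1 the u-umlaut is the single byte 0xFC, which can never appear in well-formed UTF-8, so the first check reports false and the second true. Pure ASCII content such as the listener.ora sample above is valid in both encodings, which would explain why the failure only shows up once German characters are present.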