UTF-16 problem: Invalid byte 2 of 3-byte UTF-8 sequence.
843834, Mar 30 2004 (edited May 8 2008)
Hi all,
I am using Axis to talk to an MS SQL Server SOAP service that uses UTF-16 as its character encoding. The service is not great, as it just returns a String of XML from the SOAP request, which I then need to parse into a DOM. I am using the following to do that (where xml is the String returned):
DocumentBuilderFactory dBFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dBFactory.newDocumentBuilder();
Document dom = dBuilder.parse(new java.io.ByteArrayInputStream(xml.trim().getBytes()));
I used the following to generate the WSDL stubs etc.:
java -cp ./axis.jar;axis-ant.jar;commons-logging.jar;commons-discovery.jar;wsdl4j.jar;jaxrpc.jar;saaj.jar org.apache.axis.wsdl.WSDL2Java --verbose --timeout 300 http://atws.atdw.com.au/soap/AustralianTourismWebService.asmx?WSDL
Now the problem I am getting is "Invalid byte 2 of 3-byte UTF-8 sequence.", which I assume means Axis is pulling the document in as UTF-8 instead of UTF-16?
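Note that a message of this kind may not involve Axis at all. String.getBytes() with no argument encodes with the platform default charset, and a parser reading a byte stream with no usable encoding declaration assumes UTF-8. The sketch below is only an illustration under those assumptions: the class name EncodingMismatchDemo, the helper parses, and the sample XML are made up for the example and are not taken from the service. It encodes the same small document once as UTF-8 and once as ISO-8859-1 and reports whether the parser accepts the bytes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class, not part of the original test code.
class EncodingMismatchDemo {

    // Returns true if the XML, encoded with the given charset, parses
    // from a byte stream; false if the parser rejects the bytes.
    static boolean parses(String xml, Charset charset) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(charset)));
            return true;
        } catch (Exception e) {
            // Covers SAXException, IOException and ParserConfigurationException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Made-up sample with one non-ASCII character and no XML declaration,
        // so the parser falls back to UTF-8 when reading bytes.
        String xml = "<name>Caf\u00e9</name>";

        System.out.println("UTF-8 bytes parse:      " + parses(xml, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 bytes parse: " + parses(xml, StandardCharsets.ISO_8859_1));
    }
}
```

In ISO-8859-1 the character é is the single byte 0xE9, which a UTF-8 decoder reads as the start of a 3-byte sequence; the next byte is '<', which is not a valid continuation byte. If the platform default charset on the machine running the test is not UTF-8, xml.getBytes() could produce this kind of mismatch.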
I have read that you can force Axis into UTF-16 mode, but I have not managed to do so. Could someone shed some light on how to do this, or on what might be causing the problem?
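One approach that may be worth trying, assuming the String returned by the service already holds the correct characters, is to skip the conversion to bytes and give the parser a character stream through org.xml.sax.InputSource. A parser reading from a Reader does no byte decoding and disregards the encoding named in the XML declaration. This is a minimal sketch, not a confirmed fix for this service; the class name StringXmlParser, the method parseXmlString, and the sample document are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original test code.
class StringXmlParser {

    // Parses XML held in a String without converting it to bytes first.
    static Document parseXmlString(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // The parser reads characters directly from the Reader, so neither the
        // platform default charset nor the declared encoding is used for decoding.
        InputSource source = new InputSource(new StringReader(xml.trim()));
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: declares utf-16 but lives in an ordinary Java String.
        String xml = "<?xml version=\"1.0\" encoding=\"utf-16\"?>"
                + "<product><name>Caf\u00e9</name></product>";

        Document dom = parseXmlString(xml);
        System.out.println(dom.getDocumentElement().getTagName());
    }
}
```

If bytes really are needed, the alternative is to encode explicitly, for example xml.getBytes("UTF-16"), so that the bytes match what the declaration claims. Neither variant helps if the String was already decoded with the wrong charset before it reached this code.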
Thanks in advance,
Alex.
Exception:
java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence.
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
at org.tempuri.soap.AustralianTourismWebService.TestBASE.parseDOM(TestBASE.java:113)
at org.tempuri.soap.AustralianTourismWebService.TestGetProduct.testPRODUCT_ID_9006992(TestGetProduct.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at org.tempuri.soap.AustralianTourismWebService.RunGetProduct.main(RunGetProduct.java:17)