Java EE (Java Enterprise Edition) General Discussion

UTF-16 problem: Invalid byte 2 of 3-byte UTF-8 sequence.

843834, Mar 30 2004 (edited May 8 2008)
Hi all,

I am using Axis to talk to an MS SQL Server SOAP service that uses UTF-16 as its character encoding. The service is not great: it just returns a String of XML from the SOAP request, which I then need to parse into a DOM. I am using the following to do that (where xml is the String returned):

DocumentBuilderFactory dBFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dBFactory.newDocumentBuilder();
Document dom = dBuilder.parse(new java.io.ByteArrayInputStream(xml.trim().getBytes()));

I generated the client stubs from the WSDL with:

java -cp ./axis.jar;axis-ant.jar;commons-logging.jar;commons-discovery.jar;wsdl4j.jar;jaxrpc.jar;saaj.jar org.apache.axis.wsdl.WSDL2Java --verbose --timeout 300 http://atws.atdw.com.au/soap/AustralianTourismWebService.asmx?WSDL


The problem is that parsing fails with "Invalid byte 2 of 3-byte UTF-8 sequence.", which I assume means Axis is pulling the document in as UTF-8 instead of UTF-16?
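
One possible explanation, offered as a guess rather than a confirmed diagnosis: the String from Axis may already be decoded correctly, and the damage happens in xml.getBytes(), which encodes with the platform default charset. On a Cp1252 or ISO-8859-1 platform, a character such as é becomes the single byte 0xE9. A parser reading the bytes as UTF-8 takes 0xE9 as the first byte of a 3-byte sequence and rejects the ASCII byte that follows, which matches the message above. The sketch below avoids byte encoding altogether by handing the parser a character stream. The class name ParseXmlString, the helper parse, and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ParseXmlString {

    // Hypothetical helper: parses XML that is already held in a Java String.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // A StringReader supplies characters, not bytes, so the parser never
        // has to guess or apply a byte encoding to this input.
        InputSource source = new InputSource(new StringReader(xml.trim()));
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption, not the real service response).
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<product><name>Caf\u00e9</name></product>";
        Document dom = parse(xml);
        System.out.println(dom.getDocumentElement().getNodeName());
    }
}
```

If a byte stream really is required, the alternative would be to encode explicitly, for example xml.getBytes("UTF-16"), so that the bytes agree with the encoding the XML declaration names.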

I have read that Axis can be forced into UTF-16 mode, but I haven't managed to do it. Could someone shed some light on how to do this, or on what might be causing the problem?

Thanks in advance,
Alex.



Exception:

java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence.
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
at org.tempuri.soap.AustralianTourismWebService.TestBASE.parseDOM(TestBASE.java:113)
at org.tempuri.soap.AustralianTourismWebService.TestGetProduct.testPRODUCT_ID_9006992(TestGetProduct.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at org.tempuri.soap.AustralianTourismWebService.RunGetProduct.main(RunGetProduct.java:17)
