Accented characters (ã, ä) replaced by a special character when writing UTF-8 in Java

923468, Apr 18 2012 (edited Apr 18 2012)
Accented characters (ã, ä) in the source XML turn into the special character '�' when Java code writes them to the target XML with UTF-8 encoding.

1. The source XML file is UTF-8 encoded.
2. When the file is read and written back out as UTF-8, some characters (ã, ä) are not retained as expected; they are written as '�'.
3. The reading and writing are performed by Java code.
4. The same Java code works correctly when the source and target XML files are on a Windows server.
5. The Java version used is 1.5.0.
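
The first point in the list above is worth verifying on the UNIX server itself, because '�' (U+FFFD) is what a UTF-8 decoder emits when it meets bytes that are not valid UTF-8. The following is a minimal sketch of such a check, not a confirmed diagnosis: the class name `Utf8Check` and the method `isValidUtf8` are hypothetical names introduced for illustration, and the sample strings are taken from the XML in the question. It decodes strictly, so malformed input is reported instead of being silently replaced.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not part of the original code.
class Utf8Check {
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // In UTF-8, ä is the two bytes 0xC3 0xA4; in ISO-8859-1 it is the single byte 0xE4.
        byte[] utf8 = "K\u00e4sebier".getBytes("UTF-8");
        byte[] latin1 = "K\u00e4sebier".getBytes("ISO-8859-1");
        System.out.println(isValidUtf8(utf8));   // true
        System.out.println(isValidUtf8(latin1)); // false
    }
}
```

To apply this to the real file, its raw bytes would be read through a FileInputStream (without any Reader) and passed to `isValidUtf8`. If the result is false on the UNIX server, the file there is probably not UTF-8 despite its XML declaration, for example because it was stored or transferred as ISO-8859-1.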


The source XML looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE source_item SYSTEM "http://www.extranet.xyz.com/dtdi/promis/promis_313.dtd">
<source_item>
<record sequence_number="5386" creation_date="Fri Mar 30 11:48:09 2012" />
</PUB>
<PUB pubstyle="product" IDT="12345">
<PUBLDES>Gertrude Käsebier,Sebastião Salgado, Brazil</PUBLDES>
</PUB></source_item>

When this file is read and written to another XML file (with UTF-8 encoding), the words Käsebier and Sebastião appear as 'K�sebier,Sebasti�o'.

The Java code used to read and write is as follows:

BufferedReader source = new BufferedReader(new InputStreamReader(new FileInputStream(sourceFile), "UTF-8"));
BufferedWriter target = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(targetFile), "UTF-8"));

This code works fine when the source and target XML files are on Windows. When they are on the UNIX server, it causes the issue described above.
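
For reference, here is a hedged sketch of a complete copy loop built around the same explicit "UTF-8" readers and writers. It suggests that the two constructor calls do not by themselves corrupt well-formed UTF-8 input, and that '�' shows up when the input bytes are actually in another encoding such as ISO-8859-1. The class `Utf8Copy` and the methods `copy` and `roundTrip` are hypothetical names; in-memory streams stand in for `sourceFile` and `targetFile` so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.util.Arrays;

// Hypothetical illustration, not the original program.
class Utf8Copy {
    // Copies characters from in to out, decoding and encoding as UTF-8
    // regardless of the platform default charset. Closes both streams.
    static void copy(InputStream in, OutputStream out) throws IOException {
        Reader source = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        Writer target = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
        try {
            char[] buffer = new char[4096];
            int n;
            while ((n = source.read(buffer)) != -1) {
                target.write(buffer, 0, n);
            }
            target.flush();
        } finally {
            source.close();
            target.close();
        }
    }

    // Runs copy() over in-memory bytes so the effect can be inspected.
    static byte[] roundTrip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(input), out);
        } catch (IOException e) {
            // Not expected for in-memory streams.
            throw new RuntimeException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String text = "Sebasti\u00e3o Salgado";
        byte[] utf8 = text.getBytes("UTF-8");
        byte[] latin1 = text.getBytes("ISO-8859-1");

        // Well-formed UTF-8 input comes out byte-for-byte identical.
        System.out.println(Arrays.equals(utf8, roundTrip(utf8)));

        // ISO-8859-1 input decoded as UTF-8 yields the replacement character U+FFFD.
        String damaged = new String(roundTrip(latin1), "UTF-8");
        System.out.println(damaged.indexOf('\uFFFD') >= 0);
    }
}
```

If the real source file passes through such a loop unchanged on the UNIX server, another possibility is that the target file is correct and only the tool used to view it (terminal or editor) is not set to UTF-8.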

Please provide your inputs in resolving this. Thanks!
Locked on May 16 2012
Added on Apr 18 2012