Encoding Problem - can't read UTF-8 file correctly

807589, Aug 1 2008 (edited Aug 1 2008)
Windows XP, JDK 7, same with JDK 6

I can't read a UTF-8 file correctly:


Content of File (utf-8, thai string):
เม็ดเลือดขาว

When the file is opened in an editor and the text is copy-pasted into a JTextField, the characters are displayed correctly and the UTF-8 bytes are what I expect:

String text = jtf.getText();
byte[] buffs = text.getBytes("utf-8");
-32 -71 -128 -32 -72 -95 -32 -71 -121 -32 -72 -108 -32 -71 -128 -32 -72 -91 -32 -72 -73 -32 -72 -83 -32 -72 -108 -32 -72 -126 -32 -72 -78 -32 -72 -89
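
For reference, the dump above can be reproduced without a JTextField. This is a minimal sketch: the class name Utf8ByteDump and the dump helper are made up for illustration, and the Thai string is written with Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ByteDump {
    // Formats bytes as space-separated signed decimals, like the dumps in this post.
    static String dump(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(b);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Thai string from the file, spelled out as code points U+0E40, U+0E21, ...
        String text = "\u0E40\u0E21\u0E47\u0E14\u0E40\u0E25\u0E37\u0E2D\u0E14\u0E02\u0E32\u0E27";
        System.out.println(dump(text.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Each Thai character takes three bytes in UTF-8, which is why the dump comes in groups starting with -32 (0xE0).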

Read file with FileReader/BufferedReader:
line = br.readLine();

buffs = line.getBytes("utf-8"); //get bytes with UTF-8 encoding
-61 -65 -61 -66 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14

buffs = line.getBytes(); // get bytes with default encoding
-1 -2 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14
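
One thing worth checking here: the default-encoding dump starts with -1 -2, which is 0xFF 0xFE, the byte order mark of little-endian UTF-16, and the pairs that follow look like 16-bit code units (64 14 would be 0x0E40). The sketch below assumes that the dump reflects the raw bytes of the file and decodes it as UTF-16; the class name DecodeDumpAsUtf16 is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class DecodeDumpAsUtf16 {
    // Bytes copied from the default-encoding dump above (assumed to be the raw file content).
    static final byte[] DUMP = {
        -1, -2, 32, 0, 64, 14, 33, 14, 71, 14, 20, 14, 64, 14,
        37, 14, 55, 14, 45, 14, 20, 14, 2, 14, 50, 14, 39, 14
    };

    static String decode(byte[] raw) {
        // The "UTF-16" charset reads a leading byte order mark, takes the byte
        // order from it, and does not include the mark in the decoded text.
        return new String(raw, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        String decoded = decode(DUMP);
        // Print code points rather than glyphs so the console encoding does not matter.
        for (char c : decoded.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

If the assumption holds, the code points are a space (U+0020) followed by the twelve Thai characters of the expected string.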

Read file with:
FileInputStream fis...
InputStreamReader isr = new InputStreamReader(fis,"utf-8");
BufferedReader brx = new BufferedReader(isr);
line = brx.readLine();

buffs = line.getBytes("utf-8");
-17 -65 -67 -17 -65 -67 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14


buffs = line.getBytes();
63 63 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14
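
The two prefixes in these dumps have a recognisable shape: -17 -65 -67 is 0xEF 0xBF 0xBD, the UTF-8 encoding of U+FFFD (the replacement character), and 63 is '?'. Both are what a decoder and an encoder substitute when they meet input they cannot map. A small sketch of that effect follows; the class name ReplacementCharDemo is made up, and ISO-8859-1 is only a stand-in for the platform default charset, which I have not confirmed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReplacementCharDemo {
    // Decodes raw bytes as UTF-8. Malformed input is not rejected;
    // the String constructor substitutes U+FFFD for it.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First four bytes of the file as seen in the earlier default-encoding dump.
        byte[] firstFour = { -1, -2, 32, 0 };
        String decoded = decodeAsUtf8(firstFour);

        // Re-encoding as UTF-8 writes each U+FFFD as three bytes.
        System.out.println(Arrays.toString(decoded.getBytes(StandardCharsets.UTF_8)));

        // Re-encoding with a charset that has no U+FFFD writes '?' (63) instead.
        // ISO-8859-1 is an assumed stand-in for the default charset.
        System.out.println(Arrays.toString(decoded.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

So 0xFF and 0xFE are not valid anywhere in UTF-8, and the original two bytes are already lost by the time getBytes is called.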


Does anybody have an idea? The file seems to be UTF-8 encoded. What could be wrong here?
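
One way to check what the file really contains is to look at its first bytes before choosing a charset for the reader. The sketch below is only an illustration under assumptions: BomSniffer, charsetFromBom and readFirstLine are made-up names, only the UTF-8 and UTF-16 byte order marks are recognised (UTF-32 is not handled), and a file without a mark falls back to whatever charset the caller passes in.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.util.Arrays;

public class BomSniffer {
    // Returns the charset implied by a leading byte order mark, or null if none is recognised.
    static String charsetFromBom(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        return null;
    }

    // Reads the first line using the charset implied by the mark, else the fallback.
    static String readFirstLine(File file, String fallbackCharset) throws IOException {
        byte[] head = new byte[3];
        int n;
        try (InputStream in = new FileInputStream(file)) {
            n = in.read(head);
        }
        String detected = charsetFromBom(Arrays.copyOf(head, Math.max(n, 0)));
        String charset = (detected != null) ? detected : fallbackCharset;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            String line = reader.readLine();
            // These decoders hand the mark back as U+FEFF, so drop it if present.
            if (line != null && line.startsWith("\uFEFF")) {
                line = line.substring(1);
            }
            return line;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: a UTF-16LE mark followed by U+0E40 and U+0E21.
        File tmp = File.createTempFile("bom-sniffer", ".txt");
        tmp.deleteOnExit();
        try (OutputStream out = new FileOutputStream(tmp)) {
            out.write(new byte[] { -1, -2, 64, 14, 33, 14 });
        }
        String line = readFirstLine(tmp, "UTF-8");
        for (char c : line.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

Checking the mark first avoids guessing: the same reader code then works whether the editor saved the file as UTF-8 or as UTF-16.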
Locked on Aug 29 2008
Added on Aug 1 2008