Encoding Problem - can't read UTF-8 file correctly
807589, Aug 1 2008 (edited Aug 1 2008)
Windows XP, JDK 7, same with JDK 6
I can't read a UTF-8 file correctly:
Content of the file (UTF-8, Thai string):
เม็ดเลือดขาว
When I open the file in an editor and copy-paste the text into a JTextField, the characters are displayed correctly:
String text = jtf.getText();
byte[] buffs = text.getBytes("utf-8"); // get bytes with UTF-8 encoding
-32 -71 -128 -32 -72 -95 -32 -71 -121 -32 -72 -108 -32 -71 -128 -32 -72 -91 -32 -72 -73 -32 -72 -83 -32 -72 -108 -32 -72 -126 -32 -72 -78 -32 -72 -89
Reading the file with FileReader/BufferedReader (default encoding):
line = br.readLine();
buffs = line.getBytes("utf-8"); //get bytes with UTF-8 encoding
-61 -65 -61 -66 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14
buffs = line.getBytes(); // get bytes with default encoding
-1 -2 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14
Reading the file with an explicit UTF-8 InputStreamReader:
FileInputStream fis...
InputStreamReader isr = new InputStreamReader(fis,"utf-8");
BufferedReader brx = new BufferedReader(isr);
line = brx.readLine();
buffs = line.getBytes("utf-8");
-17 -65 -67 -17 -65 -67 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14
buffs = line.getBytes();
63 63 32 0 64 14 33 14 71 14 20 14 64 14 37 14 55 14 45 14 20 14 2 14 50 14 39 14
Does anybody have an idea? The file seems to be UTF-8 encoded. What could be wrong here?
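
A note on the byte dumps above: the default-encoding dump starts with -1 -2 (0xFF 0xFE), which is the UTF-16 little-endian byte order mark, and the pairs that follow (64 14, 33 14, ...) look like little-endian code units in the Thai block (U+0E40, U+0E21, ...). So the file is probably stored as UTF-16LE with a BOM rather than UTF-8. The sketch below is one way to check that, assuming the dump is complete. The class name ByteDumpCheck, its method names and the path argument are made up for illustration and are not from the original post.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class ByteDumpCheck {
    // Bytes copied from the default-encoding dump in the question.
    static final byte[] DUMP = {
        -1, -2, 32, 0, 64, 14, 33, 14, 71, 14, 20, 14, 64, 14,
        37, 14, 55, 14, 45, 14, 20, 14, 2, 14, 50, 14, 39, 14
    };

    // True if the data starts with FF FE, the UTF-16 little-endian BOM.
    static boolean startsWithUtf16LeBom(byte[] b) {
        return b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE;
    }

    // Decodes with the "UTF-16" charset, which uses the BOM to pick the
    // byte order and does not include the BOM in the resulting String.
    static String decodeUtf16(byte[] b) {
        return new String(b, Charset.forName("UTF-16"));
    }

    // Hypothetical helper: reads the first line of a file as UTF-16.
    static String readFirstLine(String path) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), "UTF-16"));
        try {
            return reader.readLine();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("UTF-16LE BOM present: " + startsWithUtf16LeBom(DUMP));
        String decoded = decodeUtf16(DUMP);
        // Print code points instead of the text to avoid console encoding issues.
        for (int i = 0; i < decoded.length(); i++) {
            System.out.printf("U+%04X ", (int) decoded.charAt(i));
        }
        System.out.println();
        if (args.length > 0) {
            System.out.println(readFirstLine(args[0]).length() + " chars read");
        }
    }
}
```

If this assumption holds, passing "UTF-16" instead of "utf-8" to the InputStreamReader in the question should give the expected string. The dump also contains 32 0 right after the BOM, so the line seems to begin with a space.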