Hello,
The input file contains 3 characters: 123
Calling the method below with that input file produces an output file containing 12ÿý
The loop iterates twice.
Next, the input file contains 4 characters: 1234
Calling the method below with that input file produces an output file containing 1234
The loop again iterates twice.
In general, if the number of bytes in the input file that are neither carriage returns (CR) nor line feeds (LF) is odd, then the input and output files differ in the last character.
Is ÿý the byte pair FF FD, i.e. U+FFFD (the Unicode replacement character)?
It must be that with encoding UTF-16BE, 2 bytes are read at a time to form a single Unicode char, and any leftover single byte comes out as U+FFFD.
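The pairing described above can be reproduced without any files. This is a minimal sketch, assuming that String's byte-array constructor handles malformed input the same way InputStreamReader does by default (replacement with U+FFFD); the class name Utf16OddByteDemo and the helper decode are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class Utf16OddByteDemo {
    // Hypothetical helper: decodes raw bytes as UTF-16BE.
    // Malformed input (such as a lone trailing byte) is replaced, not reported.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        byte[] three = {0x31, 0x32, 0x33};       // the ASCII bytes of "123"
        byte[] four = {0x31, 0x32, 0x33, 0x34};  // the ASCII bytes of "1234"
        for (byte[] raw : new byte[][] {three, four}) {
            StringBuilder sb = new StringBuilder();
            for (char c : decode(raw).toCharArray()) {
                sb.append(String.format("U+%04X ", (int) c));
            }
            System.out.println(raw.length + " bytes -> " + sb.toString().trim());
        }
        // prints:
        // 3 bytes -> U+3132 U+FFFD
        // 4 bytes -> U+3132 U+3334
    }
}
```

So 0x31 0x32 becomes the single char U+3132, and the lone 0x33 becomes U+FFFD. Written back out as UTF-16BE that is the bytes 31 32 FF FD, which a single-byte viewer shows as 12ÿý.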
How do I remedy this and still use UTF-16BE?
Thank you!
public void test(String input, String output)
{
    try
    {
        // "UnicodeBigUnmarked" is the legacy java.io name for UTF-16BE (big-endian, no BOM)
        FileInputStream fistream = new FileInputStream(input);
        InputStreamReader istreamReader = new InputStreamReader(fistream, "UnicodeBigUnmarked");
        BufferedReader reader = new BufferedReader(istreamReader);
        FileOutputStream fostream = new FileOutputStream(output, false);
        BufferedOutputStream bostream = new BufferedOutputStream(fostream);
        BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(bostream, "UnicodeBigUnmarked"));
        try
        {
            // copy one decoded char (one UTF-16 code unit) per iteration
            int chr;
            while((chr = reader.read()) != -1)
            {
                echo((char) chr + " == " + chr + " == " + chr);
                writer.write(chr);
            }
        }
        finally
        {
            reader.close();
            writer.close();
        }
    }
    catch(IOException iox)
    {
        echo("test(): exceptions:" + iox.getMessage());
    }
}
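One possible direction, offered as a sketch rather than a definitive fix: UTF-16BE needs two bytes per code unit, so a lone trailing byte cannot be decoded as any character. Assuming the goal is to detect such input instead of silently getting U+FFFD, the reader can be built from a CharsetDecoder configured with CodingErrorAction.REPORT. The class name StrictUtf16Demo and the helpers strictReader and readAll are hypothetical, and a ByteArrayInputStream stands in for the input file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf16Demo {
    // Hypothetical helper: a UTF-16BE reader that reports malformed input
    // instead of replacing it with U+FFFD.
    static Reader strictReader(InputStream in) {
        CharsetDecoder decoder = StandardCharsets.UTF_16BE.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return new InputStreamReader(in, decoder);
    }

    // Hypothetical helper: decodes all bytes, one char per read() call as in the
    // original loop. A dangling byte surfaces as a CharacterCodingException.
    static String readAll(byte[] raw) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = strictReader(new ByteArrayInputStream(raw))) {
            int chr;
            while ((chr = reader.read()) != -1) {
                sb.append((char) chr);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Even number of bytes: decodes cleanly into two chars.
        System.out.println(readAll(new byte[] {0x31, 0x32, 0x33, 0x34}).length());
        try {
            // Odd number of bytes: the last byte has no partner.
            readAll(new byte[] {0x31, 0x32, 0x33});
        } catch (CharacterCodingException e) {
            System.out.println("odd trailing byte rejected: " + e);
        }
    }
}
```

This only makes the problem visible; it does not recover the odd byte. If the input really holds one byte per character (as the 3-byte file 123 suggests), it may not be UTF-16BE data in the first place, in which case the reader would need the file's actual encoding while the writer keeps UTF-16BE.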