UTF-16 encoding
807605, Jul 2 2007 (edited Jul 2 2007)
Hi,
I was writing a small program when I ran into a problem I can't explain.
I use a library called jid3lib to read MP3 ID3 tags. That part works fine, but I get an odd result with the following code.
The artist in the MP3 file is "Aborted":
byte[] bytes = tagv2.getLeadArtist().getBytes();//("UTF-16");
--> Assume this is UTF-16 encoded, which is valid and actually true in my case.
Then I print the results:
System.out.println(new String(bytes));
System.out.println(new String(bytes, "UTF-16"));
byte[] b = ("Aborted").getBytes("UTF-16");
System.out.println(new String(b));
System.out.println(new String(b, "UTF-16"));
I get:
��A_b_o_r_t_e_d
Aborte?
for the MP3 tag, and:
��_A_b_o_r_t_e_d
Aborted
for my own test.
PS: the "_" stands for a character that cannot be displayed (you know, the rectangle one), so I replaced it with an underscore.
So you can see the problem: why do I get a '?' instead of the 'd', and how can I avoid this?
Is there a common algorithm to decode all sorts of encodings like this (e.g. by removing all the bad characters)?
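To take jid3lib out of the picture, here is a small self-contained sketch that dumps the bytes in hex and decodes them with an explicit charset. The class and method names (`Utf16Demo`, `hex`, `truncatedTagBytes`) are made up for this post. The truncated array is only my guess at what the tag bytes might look like, based on the printed output above (BOM first, then 'A' without a leading unprintable character, and nothing after the 'd'); I have not verified that this is what jid3lib actually returns. It uses `StandardCharsets`, so it needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public class Utf16Demo {

    // Render bytes as space-separated hex so the BOM and byte order are visible.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte x : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    // ASSUMPTION (not confirmed): the tag bytes are little-endian UTF-16
    // with a BOM (FF FE), and the final 0x00 byte of the last character is missing.
    static byte[] truncatedTagBytes() {
        byte[] le = "Aborted".getBytes(StandardCharsets.UTF_16LE); // no BOM
        byte[] out = new byte[2 + le.length - 1];
        out[0] = (byte) 0xFF;
        out[1] = (byte) 0xFE;
        System.arraycopy(le, 0, out, 2, le.length - 1); // drop the last byte
        return out;
    }

    public static void main(String[] args) {
        // My own test: Java's "UTF-16" encoder writes a big-endian BOM (FE FF).
        byte[] b = "Aborted".getBytes(StandardCharsets.UTF_16);
        System.out.println(hex(b));
        System.out.println(new String(b, StandardCharsets.UTF_16));

        // Guessed tag bytes: an odd byte count leaves the last character incomplete,
        // and the decoder substitutes U+FFFD for it.
        byte[] t = truncatedTagBytes();
        System.out.println(hex(t));
        String decoded = new String(t, StandardCharsets.UTF_16);
        System.out.println(decoded.length() + " chars, last is U+"
                + Integer.toHexString(decoded.charAt(decoded.length() - 1)).toUpperCase());
    }
}
```

If the hex dump of the real tag bytes also shows an odd number of bytes, that would at least explain the '?': the last character is incomplete, the decoder replaces it, and the console prints the replacement as '?'.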
thx