Java EE (Java Enterprise Edition) General Discussion

problem with encoding of xml document

843834May 7 2005 — edited Dec 31 2007
While parsing an XML document with a SAX parser, I found that the encoding of the XML document received as an input stream is "ISO-8859-1". After parsing, certain fields have to be stored in a MySQL table whose character set is "utf8". What I found is that certain characters in the original XML document are stored as question marks (?) in the database.


1. I am using MySQL 4.1.7 with the system variable character_set_database set to "utf8", so all my tables have charset "utf8".

2. I am parsing some XML files as an input stream using the SAX parser API (org.apache.xerces.parsers.SAXParser) with encoding "iso-8859-1". After parsing, certain fields have to be stored in the MySQL database.

3. Some XML files contain an "iso-8859-1" character with character code 146, which looks like an apostrophe but is actually ’, and the problem is that words like can’t are shown as can?t by the database.

4. I noticed that parsing goes well and the character code is 146 while parsing. But when I retrieve it from the database using JDBC, the character code is 63.

5. I am using a JDBC prepared statement to insert the parsed XML into the database. It seems that some problem occurs while inserting, but I don't know what it is.

6. I tried to convert iso-8859-1 to utf-8 before storing into the database, by using
utfString = new String(isoString.getBytes("ISO-8859-1"),"UTF-8");
But when I retrieve it from the database it still shows the character code as 63.

7. I also tried to retrieve it using description = new String(rs.getBytes(1),"UTF-8");
But description still contains a character with code 63 instead of 146, and can’t is still shown as can?t.
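The behaviour in points 3 to 7 can be reproduced without a database. The sketch below is only an illustration: the class and method names are hypothetical, and US-ASCII merely stands in for whatever charset the JDBC connection actually used, which the post does not state. It shows that byte 146 decodes to a C1 control character under ISO-8859-1 but to the right single quotation mark under windows-1252, that the re-decoding attempted in point 6 produces the replacement character rather than a fix, and that encoding into a charset lacking the character produces byte 63 ('?').

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Char146Demo {
    // "can" + byte 146 (0x92) + "t", as it might sit in the raw XML input.
    static final byte[] RAW = { 'c', 'a', 'n', (byte) 0x92, 't' };

    // Code of the 4th character after decoding RAW with the given charset.
    static int decodedCode(Charset cs) {
        return new String(RAW, cs).charAt(3);
    }

    // Point 6 from the post: re-decode the ISO-8859-1 bytes as UTF-8.
    // A lone 0x92 is not valid UTF-8, so Java substitutes U+FFFD.
    static int step6Code() {
        String isoString = new String(RAW, StandardCharsets.ISO_8859_1);
        String utfString = new String(
            isoString.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        return utfString.charAt(3);
    }

    // Byte produced when U+0092 is encoded into a charset that lacks it
    // (US-ASCII here is an assumption standing in for the connection charset).
    static int unmappableByte() {
        String isoString = new String(RAW, StandardCharsets.ISO_8859_1);
        return isoString.getBytes(StandardCharsets.US_ASCII)[3];
    }

    public static void main(String[] args) {
        System.out.println(decodedCode(StandardCharsets.ISO_8859_1));      // 146
        System.out.println(decodedCode(Charset.forName("windows-1252")));  // 8217
        System.out.println(step6Code());                                   // 65533
        System.out.println(unmappableByte());                              // 63
    }
}
```

If the source files were really written as windows-1252, decoding them with that charset would give U+2019, which UTF-8 can store. Whether that applies here depends on how the XML files were produced.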

Please help me find out where the problem is: in parsing, or while storing and retrieving from the database. Sorry for any spelling mistakes.
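One way to separate the parsing stage from the storage stage is to check the parser output on its own. The following is a minimal sketch, assuming the JDK's built-in JAXP SAX parser rather than the org.apache.xerces.parsers.SAXParser class named in the post. The class name, method names, and the sample document are hypothetical. It parses a small ISO-8859-1 document containing byte 146 and returns the character data, so the character code can be inspected before anything reaches JDBC.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class SaxEncodingCheck {
    // Hypothetical sample: an XML 1.0 document declared as ISO-8859-1
    // whose text contains byte 146 (U+0092 encodes to 0x92 in ISO-8859-1).
    static byte[] sampleDocument() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><d>can\u0092t</d>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Parses the bytes with the JDK's SAX parser and returns all character data.
    static String parseText(byte[] xml) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml),
            new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String parsed = parseText(sampleDocument());
        // Print each character code so the value can be compared with
        // what later comes back from the database.
        for (char c : parsed.toCharArray()) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```

If the code printed here is 146 but the value read back from the database is 63, the parser has done what the declared encoding asked, and the loss happens somewhere between the prepared statement and the table.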
Post Details
Locked on Jan 28 2008
Added on May 7 2005
2 comments
292 views