Converting Chinese Characters from UTF-8 to GB2312

807580Sep 2 2010 — edited Sep 2 2010
Hi,

I need to interact with an external system that only accepts GB2312-encoded strings as input.
I have a site that captures user input before feeding the data to that system (see the following):

<%
    String strName = request.getParameter("strName");
    boolean serviceStatus = false;

    if (strName != null)
    {
        serviceStatus = invokeTheService(strName, "text_process");
    }
%>
..
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
..

How can I encode the value of the "strName" variable as GB2312? (Please note that I am unable to change the meta Content-Type to GB2312.)

I tried the following but was unable to get it right.

strName = new String(strName.getBytes("UTF-8"),"GB2312");
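A note on why the line above cannot work: by the time request.getParameter returns, strName is already a decoded Java String (UTF-16 internally), so there is no "UTF-8 string" left to convert. Encoding it to UTF-8 bytes and then decoding those bytes as GB2312 only misreads them. Below is a minimal sketch, assuming the external system wants raw GB2312 bytes and that the runtime provides the GB2312 charset. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Gb2312Bytes {
    // Assumption: the JVM ships the GB2312 charset (it is one of the JDK's extended charsets).
    static final Charset GB2312 = Charset.forName("GB2312");

    /** Encodes an already-decoded Java String into GB2312 bytes. */
    static byte[] toGb2312(String text) {
        return text.getBytes(GB2312);
    }

    /** Reproduces the attempted conversion: UTF-8 bytes misread as GB2312. */
    static String misdecode(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), GB2312);
    }

    public static void main(String[] args) {
        // Three Chinese characters, written as Unicode escapes so the
        // source file encoding does not matter.
        String name = "\u4e2d\u6587\u5b57";

        byte[] encoded = toGb2312(name);
        // Decoding with the same charset gives the original text back.
        System.out.println(name.equals(new String(encoded, GB2312)));
        // The UTF-8 -> GB2312 round trip does not.
        System.out.println(name.equals(misdecode(name)));
    }
}
```

If invokeTheService only accepts a String, the byte array would have to be produced at whatever point the text is actually written to the external system, which the post does not show.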

I also tried using CharsetEncoder.encode to encode it to GB2312, but kept getting an UnmappableCharacterException.
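For context, CharsetEncoder.encode raises UnmappableCharacterException when the input contains a character that the target charset has no code for. That can happen with genuine input outside GB2312, and possibly also when the parameter was decoded with the wrong charset before reaching the scriptlet, though the post does not say which applies here. The sketch below is a rough illustration with hypothetical class and method names. It checks encodability first and otherwise lets the strict encoder report the problem.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

public class StrictGb2312Encoder {
    // Assumption: the JVM provides the GB2312 charset.
    static final Charset GB2312 = Charset.forName("GB2312");

    /** Returns true if every character of text has a GB2312 mapping. */
    static boolean isEncodable(String text) {
        return GB2312.newEncoder().canEncode(text);
    }

    /** Encodes text to GB2312, throwing instead of silently substituting '?'. */
    static byte[] encodeStrict(String text) throws CharacterCodingException {
        CharsetEncoder encoder = GB2312.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buffer = encoder.encode(CharBuffer.wrap(text));
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return out;
    }

    public static void main(String[] args) throws CharacterCodingException {
        String chinese = "\u4e2d\u6587";   // two simplified Chinese characters
        String hangul = "\ud55c";          // a Korean syllable, not part of GB2312

        System.out.println(isEncodable(chinese));
        System.out.println(isEncodable(hangul));
        System.out.println(encodeStrict(chinese).length);
    }
}
```

Logging the result of isEncodable together with the code points of strName is one way to see which character the encoder rejects.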

*Correct me if I'm wrong, but UTF-8 tends to represent characters in 1, 2 or 3 bytes.
In the case of Chinese characters, each character is represented by 3 bytes.
GB2312 tends to represent each character in 2 bytes.
So if I have 3 Chinese characters as input, the original strName.length() would return 9, whereas the GB2312-encoded strName should return 6?
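On the length question, a small check may help. String.length() counts UTF-16 code units, not bytes, so it does not depend on any byte encoding. The 9 and 6 figures show up only in the length of the byte arrays returned by getBytes. (UTF-8 can also use 4 bytes for characters outside the Basic Multilingual Plane.) The class and method names below are hypothetical, and the GB2312 charset is assumed to be available.

```java
import java.nio.charset.Charset;

public class EncodedLengths {
    /** Number of bytes text occupies when encoded with the named charset. */
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String name = "\u4e2d\u6587\u5b57"; // three Chinese characters

        System.out.println(name.length());               // 3 (UTF-16 code units)
        System.out.println(byteLength(name, "UTF-8"));   // 9 (3 bytes each)
        System.out.println(byteLength(name, "GB2312"));  // 6 (2 bytes each)
    }
}
```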



