Hi All,
I am trying to fetch some text from a web page that contains Unicode data.
This is what I am doing:
1. Opening an HttpURLConnection on the target URL.
2. Setting some properties on the HttpURLConnection object, such as setDoOutput, setRequestProperty, setRequestMethod, etc.
3. Getting the InputStream, creating a BufferedReader, and reading the text line by line (using BufferedReader.readLine) and even int by int (using BufferedReader.read).
I also tried providing various charsets while creating InputStreamReader:
BufferedReader br = new BufferedReader(new InputStreamReader(urlConn.getInputStream(), cs));
where cs is one of several charset strings I tried (UTF, UTF-8, UTF8, UTF-16, UTF16, etc.)
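The steps above can be sketched roughly as follows. This is a simplified illustration, not the exact code: the class and method names (UnicodeFetch, readAll, charsetFrom, fetch) are made up for the example, and falling back to UTF-8 when the server does not declare a charset is an assumption. It picks the charset from the Content-Type response header instead of guessing one.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class UnicodeFetch {

    // Decode the whole stream with the given charset; each line is terminated with '\n'.
    static String readAll(InputStream in, Charset cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, cs))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Extract the charset from a Content-Type value such as "text/html; charset=UTF-8".
    // Falls back to UTF-8 (an assumption) when no usable charset is declared.
    static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Unknown or malformed charset name: use the fallback below.
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Open the connection, then decode the body with the charset the server declared.
    static String fetch(String address) throws IOException {
        HttpURLConnection urlConn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        urlConn.setRequestMethod("GET");
        try (InputStream in = urlConn.getInputStream()) {
            return readAll(in, charsetFrom(urlConn.getContentType()));
        } finally {
            urlConn.disconnect();
        }
    }
}
```

Note that a page may also declare its encoding only in an HTML meta tag, which this sketch does not inspect.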
The code then uses the data to build an RSS feed XML for any RSS reader.
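The output side might look something like the sketch below, since the encoding used when writing the XML matters as much as the one used when reading. This is a trimmed-down illustration and not a complete feed: FeedWriter, writeFeed and escape are hypothetical names, and choosing UTF-8 for the output is an assumption. The point is that the encoding named in the XML declaration and the encoding of the Writer have to agree.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class FeedWriter {

    // Escape the characters that are not allowed as-is in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Write a minimal RSS skeleton, encoding the characters as UTF-8 bytes
    // to match the encoding named in the XML declaration.
    static void writeFeed(OutputStream out, String title, String description)
            throws IOException {
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        w.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        w.write("<rss version=\"2.0\"><channel>\n");
        w.write("<title>" + escape(title) + "</title>\n");
        w.write("<description>" + escape(description) + "</description>\n");
        w.write("</channel></rss>\n");
        w.flush(); // push buffered characters through to the underlying stream
    }
}
```

If the Writer is created without an explicit charset (for example with FileWriter on older Java versions), the platform default encoding is used, which can garble Devanagari text even when it was read correctly.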
Observation: only the English characters are read correctly; all the other Unicode characters get garbled.
Any idea, or any other way to read Unicode characters from the stream?
By the way, I am trying to read Devanagari/Hindi (Indian Unicode) text.