HOWTO: Extract XML from Oracle 10g with UTF-8 encoding
716619Dec 9 2009 — edited Dec 9 2009Hi,
my backend DB is Oracle 10.2.0.4 on a Win2003 server. As we are in germany the database uses GERMAN_GERMANY.WE8ISO8859P1 as NLS_LANGUAGE setting; the clients use the same setting
I have a client app that needs to retrieve xml data from the DB a store this in user specified file using UTF-8 encoding. I prepared a small test case to get XML containing some german umlaut charcaters:
-----
drop table TMP_TEST_XML;
create table TMP_TEST_XML(colxml xmltype);
insert into TMP_TEST_XML
select XMLRoot(
XMLElement("dummy",
XMLAttributes(
DUMMY as "value",
'aöüßÄÖÜ€' as "test"
)
), version '1.0'
)
from DUAL;
insert into TMP_TEST_XML
select XMLRoot(
XMLElement("dummy",
XMLAttributes(
DUMMY as "value",
'aöüßÄÖÜ€' as "test"
)
), version '1.0" encoding="ISO-8859-1'
)
from DUAL;
commit;
-----
This creates a table with an xmltype column with two rows containing
<?xml version="1.0"?>
<dummy value="X" test="aöüßÄÖÜ€"/>
and
<?xml version="1.0" encoding="ISO-8859-1"?>
<dummy value="X" test="aöüßÄÖÜ€"/>
-----
Now I start a SQLPlus from a DOS box but first switch the client character set to UTF-8 using:
set NLS_LANG=GERMAN_GERMANY.AL32UTF8
sqlplusw.exe usr/pwd@db
and do a select of my xml data:
SQL> select * from TMP_TEST_XML;
COLXML
--------------------------------------------------------
<?xml version="1.0"?>
<dummy value="X" test="aöüßÄÖ܀"/>
<?xml version="1.0" encoding="ISO-8859-1"?>
<dummy value="X" test="aöüß
SQL>
-----
So yes, the german umlaut got encoded to UTF-8 (guess so), but what about the encoding attribute?
Wether or not this is specified in the xml it does not get converted. So the resulting xml docs are not correct...
Does anybody know what I need to do to get this right?
Thanks,
Stefan