Yet another character encoding question
Hello all,
I need to clarify some things to help me understand the problems I am encountering with my XML documents generated in Oracle.
I am using 10g (though the solution must be 9i compatible) to create large XML docs (1-4 MB). I have not put in an <?xml ...?> prolog yet, but from what I understand, 10g would just ignore this anyway, so I don't think that is my issue.
I am getting these documents as CLOBs from a PL/SQL function I have created in the database. The CLOBs themselves seem fine, but I get character errors in IE (I use the mu symbol fairly extensively, and there may be other potential problem characters). I think this is because, although I want UTF-8, the documents themselves are not encoded in UTF-8 but in my database (or client) character set. Is that right?
If I changed my client to UTF-8, would I then get my CLOBs handed to me pre-encoded? At the moment, I have discovered that if I open my documents in Notepad and re-save them selecting UTF-8 encoding, the problem disappears.
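To show what I think is going on with the mu symbol, here is a small sketch (Python, purely for illustration). It assumes my mu is the micro sign U+00B5 and that the file is written out as Windows-1252. As far as I know, an XML document with no encoding declaration and no byte order mark is read as UTF-8, and the single Windows-1252 byte for mu is not valid UTF-8 on its own.

```python
# Illustration only: how the micro sign is stored in Windows-1252 versus UTF-8.
# Assumption: the "mu" in my documents is U+00B5 (MICRO SIGN).
MU = "\u00b5"

cp1252_bytes = MU.encode("cp1252")  # one byte in Windows-1252
utf8_bytes = MU.encode("utf-8")     # two bytes in UTF-8


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(cp1252_bytes, is_valid_utf8(cp1252_bytes))
print(utf8_bytes, is_valid_utf8(utf8_bytes))
```

If that is the cause, it would also explain why re-saving as UTF-8 in Notepad makes the error go away.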
So, in summary:
1) Can I put a prolog in at time of generation? The string concatenation workaround won't do it for me because of the size of my docs
2) Is my data encoded according to the client or database character set? I'm pretty sure it's client, but would just like confirmation
3) Is there any way to get my CLOB data to come back as UTF-8 without having to change client/db character sets?
4) If not, would it be safe to take it out of the database in whatever character set is currently used and then convert it afterwards, effectively a programmatic version of what I'm currently having to do with Notepad?
(I'm currently using ENGLISH_UNITED KINGDOM.WE8MSWIN1252, but my customers could be using anything on either client or db.)
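For question 4, this is roughly what I have in mind, as a sketch only (Python used just to illustrate; the function name, chunk size and sample element are made up by me). It assumes the bytes I pull out really are Windows-1252, which is exactly what I am unsure about in question 2. It reads the document in chunks so a 1-4 MB file never has to be held as one string, and writes the prolog first.

```python
import io

XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>\n'


def transcode_to_utf8(src, dst, source_encoding="cp1252", chunk_chars=64 * 1024):
    """Copy src to dst as UTF-8, writing an XML declaration first.

    src: binary file-like object holding the document in source_encoding
    dst: binary file-like object that receives the UTF-8 bytes
    """
    # Decode incrementally; newline="" keeps line endings untouched.
    reader = io.TextIOWrapper(src, encoding=source_encoding, newline="")
    dst.write(XML_DECL.encode("utf-8"))
    while True:
        chunk = reader.read(chunk_chars)
        if not chunk:
            break
        dst.write(chunk.encode("utf-8"))
    reader.detach()  # leave src open for the caller


# Usage with in-memory streams; real code would open files in binary mode.
source = io.BytesIO('<value unit="\u00b5g">1</value>'.encode("cp1252"))
target = io.BytesIO()
transcode_to_utf8(source, target, chunk_chars=8)
print(target.getvalue())
```

The source encoding would have to be passed in per customer, since it depends on whichever character set the data actually comes out in. The sketch also assumes the generated document does not already start with a prolog.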