Replacing non-ASCII characters with HTML charcter references
Hi All,
In Oracle 10g or greater is there a built-in function that will convert a string with non-ASCII characters like this
a b č 뮼
into an ASCII string with HTML character references like this?
a b & # x 0 1 0 D ; & # x B B B C ;
(note I had to include spaces between each character in the sample code for message to prevent the forum software from converting my text)
I tried using
utl_i18n.escape_reference( val, 'us7ascii' )
but for some reason it returns
a b c & # x B B B C ;
Note how it converted the Western European character "č" to its unaccented counterpart "c", not "& # x 0 1 0 D ;" (is this a bug?).
I also tried a custom solution using regexp_replace and asciistr (which I can't include here because the forum software chokes on it) but it only returns the correct result for values <=4000 characters long. Unfortunately asciistr doesn't appear to accept CLOB values larger than 4000 characters. It returns an error message like
(ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 30251, maximum: 4000) ).
I'm looking for a solution that works on CLOB data of any size.
Thanks in advance for any insight you can provide.
--
Joe Fuda