Skip to Main Content

Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Oracle error when parsing XML file with HTML escape characters

495997Dec 14 2011 — edited Dec 15 2011
Hi,

In Oracle 10g (10.2.x) and 11g (11.2.x), I receive the following error when trying to load an XML documents off a staging table:
ERROR:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00217: invalid character 8211 (U+2013)
Error at line 1
ORA-06512: at "SYS.XMLTYPE", line 301
ORA-06512: at line 1

It seems Oracle is having an issue parsing HTML escape characters. The above error is easy to recreate with a simple example:

ADAVEY@easdev>select xmltype('<comment>Some text then & #8211; more text</comment>')
2 from dual;


So as it turns out Oracle is choking on the "& #8211;" string. NOTE that I have added a space between the & and # symbols so that this forum can display the string properly.

Is there any setting or flag that can be enabled before trying to load the document via the xmltype() function such that Oracle treats this (and similar) values as a literal? We don't have control over the contents of the files that we are receiving as they come from multiple vendors. I have validated the XML file with other parsers and none of them report any error with this data. So I'm not sure how to work around this issue with Oracle's XML parser.

Thanks,

Alan
This post has been answered by odie_63 on Dec 14 2011
Jump to Answer
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Jan 12 2012
Added on Dec 14 2011
8 comments
2,050 views