Skip to Main Content

APEX

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Translation - XLF file with the UTF-8 BOM at the head of the file

JustinCaveAug 10 2011 — edited Aug 12 2011
Using APEX 3.2, I'm trying to get translations loaded for an application that I'm working on. We are going through the process of generating XLIFF files, sending them to a translator to translate, and then importing the data back in rather than manually editing the translations.

This works perfectly when I use my text editor to enter the (very poor) translations in the file. Our translator, however, uses a tool that adds a byte-order mask BOM at the head of the XLIFF file when it saves it in order to identify it as a UTF-8 encoded file. This appears to be a valid way for the tool to indicate the encoding being used though, obviously, the encoding in the header overrides it and the BOM is not the preferred method of specifying the encoding of an XML file. However, when we import the file in APEX, and hit the Apply XLIFF button, we get an error

ORA-31011: XML parsing failed ORA-19202: Error occurred in XML processing LPX-00210: expected '<' instead of '¿' Error at line 1

The third byte of the UTF-8 BOM (0xEF,0xBB,0xBF), if interpreted as an ISO-8859-1, would be the upside down question mark character in the error message. So it appears that the APEX importer is interpreting the BOM using the ISO-8859-1 character set and throwing an error that the file is not valid. That is not my reading of the proper XML parser behavior.

Obviously, we can work around the problem either by asking our translator to use a different tool or by opening up the files we get back in a hex editor and removing the BOM. Is there anything we can do to resolve the issue that doesn't involve adding an additional manual step to the process or the translator changing the tool being used? Or is there something I'm missing that would indicate that a file with the BOM in the first three bytes should not be considered a valid XML file?

Justin
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Sep 9 2011
Added on Aug 10 2011
5 comments
829 views