Skip to Main Content

SQLcl: MCP Server & SQL Prompt

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

SQLcl LOAD command does not handle data containing enclosures correctly

Erik van RoonFeb 9 2023 — edited Feb 9 2023

SQLcl, current version = 22.4

Just ran into (noticed?) this problem with the load command in SQLcl for the first time, but I reproduced it all the way back to version 17.2 , so It's safe to say it's always been this way:

If data in a table contains characters that are set to be used as enclosure characters by the UNLOAD and LOAD commands, the UNLOAD command will correctly escape the characters when writing them to the output file.
So, suppose enclosure character is double quote, and a column contains the value

there are "double quotes" in the data

then this value will be written to the file as

"there are ""double quotes"" in the data"

(Note the escaping by making them double-double quotes)

But when you try to load that same file using the LOAD command, those doubled double quotes are treated as actual data, not as escaped double quotes.
So what you will end up with in your table, after LOAD is:

there are ""double quotes"" in the data

That is, if it fits in the column. I found out when I got errors for data that was suddenly to long for the target columns.
But it's worse if it *does* fit in the column, because then you just have a successful load, but your data is now malformed

How many times have I already done loads for which I didn't realize the data that got into the table was wrong?

Demo

Create table for load/unload

and insert data with and without double quotes, and with maximum length and shorter than that

create table ero_load_test
as
select level    as id
     , case level
         when 1 then 'short with "quoted" text'
         when 2 then 'short without quoted text'
         when 3 then 'max length, without any quoted text'
         when 4 then 'max length, including "quoted" text'
       end      as text
  from dual
connect by level <= 4
;

Check table and its data

desc ero_load_test

select t.*
     , length(t.text)   as text_length
  from ero_load_test  t
;

UNLOAD the data and check the file

I use he unload command with all configuration set to their defaults, so enclosure character will be double quote

unload ero_load_test

host type ERO_LOAD_TEST_DATA_TABLE.csv

And as we expect the double quotes in the values are escaped

Clear the table and re-load the data

When we truncate the table and then use LOAD to load the data in the file back into the table we run into an error:

truncate table ero_load_test;

load ero_load_test ERO_LOAD_TEST_DATA_TABLE.csv

Row number 4 does not fit because, including the extra double quotes used to escape the original ones, the text value has a length of 37, which obviously doesn't fit into a column with a width of 35.

Check data in table

If we check the data that IS loaded into the table…..

select t.*
     , length(t.text)   as text_length
  from ero_load_test  t
;

….. we can see that the value with double quotes that was short enough to be able to grow was inserted with double the amount of double quotes.

This leads to the conclusion that it's not a length check that happens too early (before stripping the escaping characters), but that the escaping characters are not stripped at all.

Finally cleanup

host del ERO_LOAD_TEST_DATA_TABLE.csv
drop table ero_load_test purge;

Comments
Post Details
Added on Feb 9 2023
1 comment
571 views