Hi All,
I would need your help to understand how conversion between client and database works.
I am running on following environment:
Oracle Linux Server release 7.2
SQL> select * from v$version;
BANNER CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Standard Edition Release 12.1.0.2.0 - 64bit Production 0
PL/SQL Release 12.1.0.2.0 - Production 0
CORE 12.1.0.2.0 Production 0
TNS for Linux: Version 12.1.0.2.0 - Production 0
NLSRTL Version 12.1.0.2.0 - Production 0
SQL> SELECT * FROM V$NLS_PARAMETERS;
PARAMETER VALUE CON_ID
---------------------------------------------------------------- ---------------------------------------------------------------- ----------
NLS_LANGUAGE AMERICAN 0
NLS_TERRITORY AMERICA 0
NLS_CURRENCY $ 0
NLS_ISO_CURRENCY AMERICA 0
NLS_NUMERIC_CHARACTERS ., 0
NLS_CALENDAR GREGORIAN 0
NLS_DATE_FORMAT DD-MON-RR 0
NLS_DATE_LANGUAGE AMERICAN 0
NLS_CHARACTERSET AL32UTF8 0
NLS_SORT BINARY 0
NLS_TIME_FORMAT HH.MI.SSXFF AM 0
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM 0
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR 0
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR 0
NLS_DUAL_CURRENCY $ 0
NLS_NCHAR_CHARACTERSET AL16UTF16 0
NLS_COMP BINARY 0
NLS_LENGTH_SEMANTICS BYTE 0
NLS_NCHAR_CONV_EXCP FALSE 0
19 rows selected.
I have a file with following content:
;RÄ;E;
It is a 6B file containing six 8bit encoded characters. German character "Ä" is C4 hexa.
When I login to OS NLS_LANG is set to the same character set as database character set
echo $NLS_LANG
AMERICAN_AMERICA.AL32UTF8
Because I have 8bit encoded file, I want to change it to the file character set:
export NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P15
Then I connect to the database. To be on safe side, I check the NLS_LANG setting from sqlplus
SQL> host echo $NLS_LANG
AMERICAN_AMERICA.WE8ISO8859P15
Then I run following script:
DECLARE
v_FileHandle UTL_FILE.FILE_TYPE;
vvc_Line VARCHAR2(1000);
i NUMBER := 1;
BEGIN
v_FileHandle := UTL_FILE.FOPEN('SOME_DIRECTORY', 'file.txt', 'r');
UTL_FILE.GET_LINE(v_FileHandle, vvc_Line);
UTL_FILE.FCLOSE(v_FileHandle);
DBMS_OUTPUT.PUT_LINE('LENGTH: '||LENGTH(vvc_Line));
DBMS_OUTPUT.PUT_LINE('LENGTHB: '||LENGTHB(vvc_Line));
--
i := INSTR(vvc_Line, ';', 1, 1);
WHILE i > 0 LOOP
DBMS_OUTPUT.PUT_LINE(i);
DBMS_OUTPUT.PUT_LINE(SUBSTR(vvc_Line, i+1, INSTR(vvc_Line, ';', i+1, 1) - (i+1)));
i := i+1;
i := INSTR(vvc_Line, ';', i, 1);
END LOOP;
END;
/
and I get following output:
LENGTH: 5
LENGTHB: 6
1
R▒E
5
From the output it is clear that the file is treated as UTF8 encoded file. Length in characters is 5 because Oracle thinks that letters "Ä;" is one letter because german letter "Ä" is above 128 bit ASCII table...
I would suppose that when changing client character set to character set that corresponds with file character set would mean that Oracle will do the conversion from WE8ISO8859P15 to AL32UTF8 while reading the file. So I would suppose that this function call:
UTL_FILE.GET_LINE(v_FileHandle, vvc_Line);
will read the line from the file in 8bit encoding, convert it to UTF8 and fill the value to variable vvc_Line.
Am I missing something or do I need to restart something after changing character set on client?
Thanks a lot for your comments.
;RÄ;E;