Hey everyone,
Hoping to get some help understanding NVARCHAR behaviour and such; I haven't used them much in the past.
I'm on: 11.2.0.3.0
Background:
- existing database:
NLS_CHARACTERSET WE8ISO8859P1
NLS_NCHAR_CHARACTERSET UTF8
Most columns use "normal" VARCHAR2 data type, and I'm generally ok with that.
However, they used NVARCHAR2 data type for "french" column data (despite the fact the WE8ISO8859P1 character set they use supports french. *sigh* )
Anyway ... issue isn't even with a french character. I believe it's a Window's "Smart" quote.
Issue:
Running a query to get a dump of the "bad" data from production, shows the character ( a quote of some sort: ‘ not sure if that will actually transfer across properly), as either ASCII code 49810 or 49809 (we see both)
So trying to setup a "test" in development to reproduce, and I can't seem to do it.
create table junk ( id number, vv nvarchar2(100) );
insert into junk values ( 1, chr(49809) );
insert into junk values ( 2, chr(49810) );
insert into junk values ( 3, chr(96) );
commit;
select id, vv, dump(vv) from junk;
Both SQL Developer (via Windows) and SQL*Plus (via Unix), show same thing:
ID VV DUMP(VV)
---------- ---------------------------------------- --------------------------------------------------
1 ? Typ=1 Len=2: 0,145
2 ? Typ=1 Len=2: 0,146
3 ` Typ=1 Len=2: 0,96
3 rows selected.
In both cases, the "smart" quotes didn't store properly and appear to be "converted".
I'm not really sure what's going on, however, I was trying to figure out how to set my NLS_LANG, but not sure what to set it to?
In Unix, I tried:
export NLS_LANG=American_America.UTF8
however, That changes the code showing from "?" to "Â" (re-ran the entire script, same code: 145/146 stored, so still losing it on insert, I assume)
So I suspect it's a case of that character not actually supported by UTF8, right?
For SQL Developer, not as much luck .. I set a Windows env variable, NLS_LANG to same as above, however, it still shows in SQL Developer as an empty square.
Questions:
1) Just to verify, I'm not even sure that these characters (ie 49810 and 49809) are actually supported by UTF8 ?? (did some searching, but couldn't find anything that could confirm it for certain .. )
2) How to (properly) set NLS_LANG for SQL Developer, and what to set it to so I can read/write characters in these pesky NVARCHAR fields?
3) how to enter (even forcefully) - the characters 49809 or 49810 ?? (for testing purposes only! )
(FYI: This is mostly for me learning. The "solution" to our initial problem is to convert these bad characters to "normal" quotes: ie ASCII code 39. Of course, being able to test the update properly would be really really nice, hence I need to "force" entry of some bad data into dev
Thanks!