I have a strange edge case where a regular expression fails if I change languages. This works fine:
alter session set nls_language = 'ENGLISH';
select regexp_count( 'foo', '[Ø-ö]' ) as x from dual;
This raises ORA-12728, "Invalid range in regular expression:
alter session set nls_language = 'FRENCH';
select regexp_count( 'foo', '[Ø-ö]' ) as x from dual;
It looks like regular expression ranges can be influenced by the language setting, specifically NLS_SORT:
In the POSIX standard, a range includes all collation elements between the start and end of the range in the linguistic definition of the current locale. Thus, ranges are linguistic rather than byte values ranges; the semantics of the range expression are independent of character set. In Oracle Database, the linguistic range is determined by the NLS_SORT
initialization parameter.
I can expand the range to all the characters involved, but what does the range **[Ø-ö]**
actually refer to? Is it just U+00D8 to U+00F6, but French orders those differently?