Hi,
I'm running into a problem with Oracle Text when searching using the ! (soundex) option. I've created a simple test example to highlight the issue.
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit
Windows 2008 Server 64-bit
create table test_tab (test_col varchar2(200));
insert all
into test_tab (test_col) values ('ab-tönes')
into test_tab (test_col) values ('ab-tones')
into test_tab (test_col) values ('abtones')
into test_tab (test_col) values ('ab tones')
into test_tab (test_col) values ('ab-tanes')
select * from dual
/
select * from test_tab
/
begin
ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+&''');
ctx_ddl.set_attribute('test_lex1','base_letter','YES');
-- ctx_ddl.set_attribute('test_lex1','skipjoins','-');
end;
/
create index test_idx on test_tab (test_col)
indextype is ctxsys.context
parameters
('lexer test_lex1'
)
/
select token_text from dr$test_idx$i;
TOKEN_TEXT
----------
AB
ABTONES
TANES
TONES
select * from test_tab where contains (test_col, '!ab tones') > 0;
TEST_COL
--------
ab-tönes
ab-tones
ab tones
select * from test_tab where soundex(test_col) = soundex('ab tones');
TEST_COL
--------
ab-tönes
ab-tones
abtones
ab tones
ab-tanes
So my question is, can anyone suggest an approach whereby I can get the Oracle Text Context index (or CTXCAT index if it's more appropriate) to return all 5 rows like the simple Soundex is doing?
I can't really use soundex as this search query will form part of a search screen for a multi-language application. Soundex is limited to English sounding words, so I need the solution to be able to compare strings that may not "sound" English.
It must be an attribute of the BASIC_LEXER, and I've tried skipjoins, start/end-joins, stop lists, but I just cannot get the Soundex feature of Oracle Text to function like the SOUNDEX() function!
Looking at how the tokens are stored dr$test_idx$i I need Oracle Text to almost concat 'AB' and 'TONES' to search as a single string.
Any help greatly appreciated.
Thanks,