Skip to Main Content

Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Oracle Text - CTX Context Index Soundex Problem

user6336927Sep 16 2013 — edited Sep 19 2013

Hi,

I'm running into a problem with Oracle Text when searching using the ! (soundex) option. I've created a simple test example to highlight the issue.

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit

Windows 2008 Server 64-bit

create table test_tab (test_col  varchar2(200));

insert all

  into test_tab (test_col) values ('ab-tönes')

  into test_tab (test_col) values ('ab-tones')

  into test_tab (test_col) values ('abtones')

  into test_tab (test_col) values ('ab tones')

  into test_tab (test_col) values ('ab-tanes')

  select * from dual

  /

select * from test_tab

/

begin

      ctx_ddl.create_preference ('test_lex1', 'basic_lexer');

      ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+&''');

      ctx_ddl.set_attribute('test_lex1','base_letter','YES');

      -- ctx_ddl.set_attribute('test_lex1','skipjoins','-');

end;

/

create index test_idx on test_tab (test_col)

  indextype is ctxsys.context

    parameters

      ('lexer        test_lex1'     

      )

    /

select token_text from dr$test_idx$i;

TOKEN_TEXT

----------

AB

ABTONES

TANES

TONES

select * from test_tab where contains (test_col, '!ab tones') > 0;

TEST_COL

--------

ab-tönes

ab-tones

ab tones

select * from test_tab where soundex(test_col) = soundex('ab tones');

TEST_COL

--------

ab-tönes

ab-tones

abtones

ab tones

ab-tanes

So my question is, can anyone suggest an approach whereby I can get the Oracle Text Context index (or CTXCAT index if it's more appropriate) to return all 5 rows like the simple Soundex is doing?

I can't really use soundex as this search query will form part of a search screen for a multi-language application. Soundex is limited to English sounding words, so I need the solution to be able to compare strings that may not "sound" English.

It must be an attribute of the BASIC_LEXER, and I've tried skipjoins, start/end-joins, stop lists, but I just cannot get the Soundex feature of Oracle Text to function like the SOUNDEX() function!

Looking at how the tokens are stored dr$test_idx$i I need Oracle Text to almost concat 'AB' and 'TONES' to search as a single string.

Any help greatly appreciated.

Thanks,

Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Oct 17 2013
Added on Sep 16 2013
11 comments
5,466 views