Skip to Main Content

Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Searching for numbers with decimal points with CONTAINS

tarfu_dbaFeb 19 2016 — edited Feb 19 2016

One of our apps uses a Text CONTEXT index.  Some of the terms looked for are numbers, specifically numbers with decimal points.  I see this behaviour below  and was wondering if someone can explain the logic behind why it works this way:

SQL> select * from numbers;

   ID TEXT

----- ----------------

    1 10.56

    2 10.99

    3 11.99

    4 12.99

    5 13.99

    6 14.99

    7 15.99

    8 16.99

    9 17.99

SQL> exec ctxsys.ctx_ddl.create_preference('my_lexer','WORLD_LEXER');    -- we have to use this lexer...no option

SQL> create index numbersi on numbers(text) indextype is ctxsys.context parameters ('lexer my_lexer');

SQL> select token_type, token_text from DR$NUMBERSI$I;

TOKEN_TYPE TOKEN_TEXT

---------- ----------------------------------------------------------------

         0 10.56

         0 10.99

         0 11.99

         0 12.99

         0 13.99

         0 14.99

         0 15.99

         0 16.99

         0 17.99

-- A user wants to find rows where someone paid $10.xx for stuff.  Notice the "." in the search term, and no white space before the "%".

SQL> select * from numbers where contains(text,'10.%')>0;

no rows selected

-- get rid of the decimal point...

SQL> select * from numbers where contains(text,'10%')>0;

        ID TEXT

---------- ----------------------------------------------------------------

         1 10.56

         2 10.99

-- EXPLAIN shows this splitting of the search terms in the 2 cases above:

SQL> exec ctx_query.explain(index_name => 'numbersi', text_query => '10.%', explain_table => 'test_explain');

PL/SQL procedure successfully completed.

SQL> select id, parent_id, operation, object_name from test_explain;

        ID  PARENT_ID OPERATION                      OBJECT_NAME

---------- ---------- ------------------------------ ------------------------------

         1          0 PHRASE

         2          1 WORD                           10

         3          1 WORD                           %

SQL> exec ctx_query.explain(index_name => 'numbersi', text_query => '10%', explain_table => 'test_explain');

PL/SQL procedure successfully completed.

SQL> select id, parent_id, operation, object_name from test_explain;

        ID  PARENT_ID OPERATION                      OBJECT_NAME

---------- ---------- ------------------------------ ------------------------------

         1          0 EQUIVALENCE                    10%

         2          1 WORD                           10.56

         3          1 WORD                           10.99

My theory is that a trailing decimal point, aka period, is treated as an end of sentence character, which is punctuation, and it gets replaced by a blank; this makes a search expression'10.%' look like '10 %', which is the phrase with word "10" followed by any other word. Since there are no following words in my TEXT column, nothing matches this phrase. I see the same behaviour if there is an explicit blank character in the expression:

SQL> exec ctx_query.explain(index_name => 'numbersi', text_query => '10. %', explain_table => 'test_explain');  -- <<<< notice the space AFTER the "."

PL/SQL procedure successfully completed.

SQL> select id, parent_id, operation, object_name from test_explain;

        ID  PARENT_ID OPERATION                      OBJECT_NAME

---------- ---------- ------------------------------ ------------------------------

         1          0 PHRASE

         2          1 WORD                           10

         3          1 WORD                           %

Why is that period removed in such a manner that the expression gets split into 2 terms when there is no whitespace between the "." and the "%"?

Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Mar 18 2016
Added on Feb 19 2016
2 comments
1,671 views