Skip to Main Content

Database Software

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Performance MULTI_COLUMN_DATASTORE and BASIC_SECTION_GROUP

WGabrielFeb 2 2010 — edited Feb 16 2010
In the following example there is a table with 3 text columns. There is a MULTI_COLUMN_DATASTORE
for all these columns. For each column there is a field section in a BASIC_SECTION_GROUP:
    ...   
  CTX_DDL.CREATE_PREFERENCE ('my_pref', 'MULTI_COLUMN_DATASTORE');
  CTX_DDL.SET_ATTRIBUTE     ('my_pref', 'COLUMNS', 'col1, col2, col3' );
  ...
  CTX_DDL.CREATE_SECTION_GROUP ('my_section', 'BASIC_SECTION_GROUP');

  CTX_DDL.ADD_FIELD_SECTION ('my_section', 'col1',   'col1',   TRUE);
  CTX_DDL.ADD_FIELD_SECTION ('my_section', 'col2',   'col2',   TRUE);
  CTX_DDL.ADD_FIELD_SECTION ('my_section', 'col3',   'col3',   TRUE);
The table CTXSYS.DR$SECTION then contains three rows.
Each row in the own index table $I is labled with the TOKEN_TYPE according DR$SECTION:
  Table CTXSYS.DR$SECTION:

  SEC_FID = 16  'col1'
  SEC_FID = 17  'col2'
  SEC_FID = 18  'col3'
    
  Table DR$INDEX_FUZZY$I:

  TOKEN_TYPE = 16 / 17 / 18
 
CREATE INDEX idx_fuzzy ON mytable (dummycol)
 INDEXTYPE IS CTXSYS.CONTEXT
 PARAMETERS ('... SECTION GROUP ctxsys.my_section ... );
The query looks like this: search a text string in the column col1:
exec :bnd := '?mysearch% WITHIN col1';

select * 
 from mytable
where contains(dummycol, :bnd, 0 ) > 0;
A tracing analysis using tkprof shows, that there is a query like this, which is the most time consuming query (total cpu time here: 44 sec.):
SELECT/*+INDEX(T "DR$IDX_FUZZY$X")*/ DISTINCT TOKEN_TEXT FROM
  "DR$IDX_FUZZY$I" T WHERE TOKEN_TEXT LIKE :lkexpr and    (TOKEN_TYPE =
   0 OR TOKEN_TYPE = 4 OR TOKEN_TYPE BETWEEN 16 AND 74)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      115      0.00       0.01          0          0          0           0
Execute   4821      0.08       0.08          0          0          0           0
Fetch     8599     43.16      44.00       8013     633750          0     4257613
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    13535     43.24      44.09       8013     633750          0     4257613
The question: why is there still a range scan like "*TOKEN_TYPE BETWEEN 16 and 74*" ?

This means that the query searches the complete INDEX table. All types will be included ( TOKEN_TYPE = 16 / 17 / 18 ), but

the query is only interested in TOKEN_TYPE = 16 ( i.e. column col1 ).

Comments

Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Mar 16 2010
Added on Feb 2 2010
2 comments
2,634 views