Problem indexing documents with Oracle Enterprise 10.2.0.4

700172, May 7 2009 (edited May 11 2009)
Hello,

We are trying to index some documents (stored as BLOBs) using Oracle Text. The documents are in PDF format, usually created with Apache FOP.

This is the user that we have created:
CREATE USER MY_USER
  IDENTIFIED BY VALUES MY_USER
  DEFAULT TABLESPACE TEST
  TEMPORARY TABLESPACE TEMP
  PROFILE DEFAULT
  ACCOUNT UNLOCK;
  -- 3 Roles for MY_USER
  GRANT CTXAPP TO MY_USER WITH ADMIN OPTION;
  GRANT RESOURCE TO MY_USER WITH ADMIN OPTION;
  GRANT CONNECT TO MY_USER WITH ADMIN OPTION;
  ALTER USER MY_USER DEFAULT ROLE ALL;
  -- 1 System Privilege for MY_USER
  GRANT UNLIMITED TABLESPACE TO MY_USER WITH ADMIN OPTION;
  -- 8 Object Privileges for MY_USER
    GRANT EXECUTE ON  CTXSYS.CTX_CLS TO MY_USER;
    GRANT EXECUTE ON  CTXSYS.CTX_DDL TO MY_USER;
    GRANT EXECUTE ON  CTXSYS.CTX_DOC TO MY_USER;
    GRANT EXECUTE ON  CTXSYS.CTX_OUTPUT TO MY_USER;
    GRANT EXECUTE ON  CTXSYS.CTX_QUERY TO MY_USER;
    GRANT EXECUTE ON  CTXSYS.CTX_REPORT TO MY_USER;
    GRANT EXECUTE ON  CTXSYS.CTX_THES TO MY_USER;
    GRANT EXECUTE ON  CTXSYS.CTX_ULEXER TO MY_USER;
And these are the scripts that we executed as the above user:
begin 
  ctx_ddl.create_preference('my_lexer', 'BASIC_LEXER'); 
  ctx_ddl.set_attribute('my_lexer', 'printjoins', '_-');
  ctx_ddl.set_attribute('my_lexer', 'base_letter', 'Yes');
  ctx_ddl.set_attribute('my_lexer', 'mixed_case', 'No');
end; 

begin
ctx_ddl.create_preference('MY_STORAGE','BASIC_STORAGE');
ctx_ddl.set_attribute('MY_STORAGE', 'I_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'K_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'R_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'N_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'P_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
end;
And here is the creation of the index:
create index DOC_I on DOC(DOCUMEMT)
  indextype is ctxsys.context parameters
    ('lexer my_lexer storage MY_STORAGE');
As soon as we create the index and look at the view CTX_INDEX_ERRORS, there are new errors:

DRG-11207: user filter command exited with status 2.
DRG-11225: Third-party filter timed out.

These errors appear for each document in that table.
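
A query along these lines lists the errors recorded for the index, newest first. It is only a sketch: it assumes the index name DOC_I from above and reads the per-user view CTX_USER_INDEX_ERRORS, which covers the indexes owned by the current schema.

```sql
-- List indexing errors for DOC_I, newest first.
-- ERR_TEXTKEY holds the ROWID of the document that failed to index.
select err_timestamp,
       err_textkey,
       err_text
  from ctx_user_index_errors
 where err_index_name = 'DOC_I'
 order by err_timestamp desc;
```

The ROWID in ERR_TEXTKEY can be used to check whether the same documents fail on every attempt.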

We have also tried to use an external filter, XPDF.

We put the executable from XPDF in this folder: /opt/oracle/product/10.1.0/db_1/ctx/bin (we are using Linux), and executed this script:
begin
  ctx_ddl.create_preference('MY_FILTER', 'USER_FILTER');
  ctx_ddl.set_attribute('MY_FILTER','COMMAND','pdftotext');
end;
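
For reference, a statement that attaches this filter to the index might look like the following sketch. It assumes the same DOC table, DOCUMEMT column, and my_lexer, MY_STORAGE, and MY_FILTER preferences as above; the exact statement used is not shown in this post.

```sql
-- Sketch only: assumes the table, column, and preferences created above.
-- An existing DOC_I index would have to be dropped first:
--   drop index DOC_I;
create index DOC_I on DOC(DOCUMEMT)
  indextype is ctxsys.context parameters
    ('filter MY_FILTER lexer my_lexer storage MY_STORAGE');
```

The filter preference is named in the same parameters string as the lexer and storage preferences.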
But this time, as soon as we create the index with this filter, the view shows only this error:

DRG-11207: user filter command exited with status 2.

We have tried running ctxhx and pdftotext directly from the Linux command line, and both convert the files.

Thanks.
Post Details
Locked on Jun 8 2009
Added on May 7 2009
2 comments
2,409 views