Hello,
We are trying to index documents stored in a BLOB column using Oracle Text. The documents are PDFs, usually generated with Apache FOP.
This is the user that we created:
CREATE USER MY_USER
IDENTIFIED BY VALUES MY_USER
DEFAULT TABLESPACE TEST
TEMPORARY TABLESPACE TEMP
PROFILE DEFAULT
ACCOUNT UNLOCK;
-- 3 Roles for MY_USER
GRANT CTXAPP TO MY_USER WITH ADMIN OPTION;
GRANT RESOURCE TO MY_USER WITH ADMIN OPTION;
GRANT CONNECT TO MY_USER WITH ADMIN OPTION;
ALTER USER MY_USER DEFAULT ROLE ALL;
-- 1 System Privilege for MY_USER
GRANT UNLIMITED TABLESPACE TO MY_USER WITH ADMIN OPTION;
-- 8 Object Privileges for MY_USER
GRANT EXECUTE ON CTXSYS.CTX_CLS TO MY_USER;
GRANT EXECUTE ON CTXSYS.CTX_DDL TO MY_USER;
GRANT EXECUTE ON CTXSYS.CTX_DOC TO MY_USER;
GRANT EXECUTE ON CTXSYS.CTX_OUTPUT TO MY_USER;
GRANT EXECUTE ON CTXSYS.CTX_QUERY TO MY_USER;
GRANT EXECUTE ON CTXSYS.CTX_REPORT TO MY_USER;
GRANT EXECUTE ON CTXSYS.CTX_THES TO MY_USER;
GRANT EXECUTE ON CTXSYS.CTX_ULEXER TO MY_USER;
And these are the scripts that we executed as the above user:
begin
ctx_ddl.create_preference('my_lexer', 'BASIC_LEXER');
ctx_ddl.set_attribute('my_lexer', 'printjoins', '_-');
ctx_ddl.set_attribute('my_lexer', 'base_letter', 'Yes');
ctx_ddl.set_attribute('my_lexer', 'mixed_case', 'No');
end;
/
begin
ctx_ddl.create_preference('MY_STORAGE','BASIC_STORAGE');
ctx_ddl.set_attribute('MY_STORAGE', 'I_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'K_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'R_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'N_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
ctx_ddl.set_attribute('MY_STORAGE', 'P_TABLE_CLAUSE', 'tablespace TEST storage (initial 10M)');
end;
/
And here is the creation of the index:
create index DOC_I on DOC(DOCUMEMT)
indextype is ctxsys.context parameters
('lexer my_lexer storage MY_STORAGE');
As soon as we create the index, the view CTX_INDEX_ERRORS shows these new errors for every document in the table:
DRG-11207: user filter command exited with status 2.
DRG-11225: Third-party filter timed out.
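The errors can be listed with a query along these lines. This is only a sketch: it assumes the session is connected as MY_USER, so that the per-user view CTX_USER_INDEX_ERRORS applies, and that the index is named DOC_I as in the statement above.

```sql
-- List the most recent indexing errors for the DOC_I index.
-- Assumes the session is connected as the index owner (MY_USER).
select err_timestamp,
       err_textkey,   -- rowid of the document that failed
       err_text       -- the DRG- error stack
  from ctx_user_index_errors
 where err_index_name = 'DOC_I'
 order by err_timestamp desc;
```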
We then tried an external filter, XPDF.
We copied the XPDF executable into /opt/oracle/product/10.1.0/db_1/ctx/bin (we are on Linux) and executed this script:
begin
ctx_ddl.create_preference('MY_FILTER', 'USER_FILTER');
ctx_ddl.set_attribute('MY_FILTER','COMMAND','pdftotext');
end;
/
This time, as soon as we create the index with this filter, the view shows only this error:
DRG-11207: user filter command exited with status 2.
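The CREATE INDEX statement used with this filter is not shown above. It would look roughly like the following sketch, assuming the earlier DOC_I index is dropped first and that the only change is the added filter clause.

```sql
-- Drop the earlier index so it can be recreated with the user filter.
drop index DOC_I;

-- Same statement as before, with the MY_FILTER preference added (assumed).
create index DOC_I on DOC(DOCUMEMT)
indextype is ctxsys.context parameters
('filter MY_FILTER lexer my_lexer storage MY_STORAGE');
```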
We have run both ctxhx and pdftotext directly from the Linux command line, and both convert the files.
Thanks.