Loading MS Word Documents for Conversion to DB Records?

443706, Jan 28 2009 (edited Feb 11 2009)
Here's what I'm working with:
Oracle 10.2.0.3.x (with the Companion CD installed)
Fedora Core 7
4 dual-core processors
16 GB RAM

For a long time, our intranet has accepted uploaded Word files, converted them to PDFs, and associated each file with a database record that stores the file's location on disk, the document name, the owner, and a few other details. The files are typically well formatted, each having a line for the title, effective date, author, owner, etc.

We've reached a point where the PDF conversion isn't working well. In addition, our search, which runs on a Google Mini, returns outdated files or files that haven't been published yet. In short, we need a better solution.

What I would like to do is:
1. Use Oracle Text to scan, load, or import the Word files.
2. Use Oracle Text to parse the Word files, identifying elements such as the effective date, owner, etc.
3. Load the results of step 2 into a table that Oracle can search. My intranet app can then convert the data to Word, PDF, or whatever format the user needs.
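
To make step 2 concrete, here is a rough sketch of the kind of parsing I have in mind, written in Python only for illustration. It assumes the Word file has already been filtered to plain text by some earlier step, and that each metadata line looks like "Label: value"; that layout, the label names, and the function names `extract_metadata` and `to_row` are all my own assumptions, not anything Oracle Text provides.

```python
import re

# Hypothetical labels: the files are only known to have "a line for" each of these.
FIELDS = ("title", "effective date", "author", "owner")

# Matches lines such as "Effective Date: 01/28/2009".
# The "Label: value" layout is an assumption about the documents.
LINE_RE = re.compile(r"^\s*(?P<label>[A-Za-z ]+?)\s*:\s*(?P<value>.+?)\s*$")


def extract_metadata(text):
    """Return a dict holding the first value found for each known label."""
    found = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        label = match.group("label").lower()
        # Keep only recognized labels, and only their first occurrence.
        if label in FIELDS and label not in found:
            found[label] = match.group("value")
    return found


def to_row(doc_name, text):
    """Flatten the metadata into a tuple suitable for binding to an INSERT."""
    meta = extract_metadata(text)
    # Missing labels become None so the row always has the same shape.
    return (doc_name,) + tuple(meta.get(field) for field in FIELDS)
```

Each tuple from `to_row` would correspond to one row in the table from step 3, with a NULL wherever a document is missing a label. Dates are left as strings here because I don't know yet how consistently they are formatted.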

I'm here because I don't know a) whether this is possible, or b) how exactly to go about accomplishing it.

I would truly appreciate any tips, pointers or links to helpful documentation.
Thank you.
This post has been answered by 12324-Oracle on Feb 10 2009
Locked on Mar 11 2009
Added on Jan 28 2009