Fuzzy problems
589591 Jul 24 2007, edited Jul 24 2007
Hello to all group members,
We have a database with medical content such as diseases.
We run search queries on this data.
In particular, disease names need to be searchable.
We have introduced an additional column "preprocessed disease name" which contains the disease name without blanks and special characters. This column has an associated Oracle Text context index.
We have implemented a fuzzy search on this additional column.
The search results are not acceptable right now.
Example:
Content of column "disease name":
1: Diabetes Mellitus
2: Diabetes Mellitus Type 1
3: Diabetes Mellitus Type 2
4: Gastritis
Content of additional column "preprocessed disease name":
1: diabetesmellitus
2: diabetesmellitustype1
3: diabetesmellitustype2
4: gastritis
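For reference, the preprocessing can be sketched roughly as below. This is only an illustration in Python, not our actual implementation: the function name `preprocess` is hypothetical, and it assumes that "special characters" means everything other than ASCII letters and digits, and that the result is lowercased as in the listing above.

```python
import re


def preprocess(name: str) -> str:
    """Lowercase the name and drop blanks and special characters.

    Assumption: only ASCII letters and digits are kept.
    """
    return re.sub(r"[^a-z0-9]", "", name.lower())


disease_names = [
    "Diabetes Mellitus",
    "Diabetes Mellitus Type 1",
    "Diabetes Mellitus Type 2",
    "Gastritis",
]

# One preprocessed value per row, in the same order as the rows above.
preprocessed = [preprocess(n) for n in disease_names]

for row, value in enumerate(preprocessed, start=1):
    print(f"{row}: {value}")
```

Note that each disease name ends up as a single long token, which matters for the fuzzy matching discussed below.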
Examples of fuzzy queries on column "preprocessed disease name" (only the fuzzy search term is given):
q1: %diabetis% => no results
q2: %diabetis%melitus% => 1 is returned
q3: %diabetes%melitus%tape% => 1, 2 and 3 are returned
q4: %gastrates% => 4 is returned
We expected q1, q2 and q3 to return data rows 1, 2 and 3, but only q3 does so.
q4 works as expected.
It seems as if word length is the problem here.
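To illustrate this suspicion, here is a rough sketch that compares the misspelled terms with the stored tokens using plain Levenshtein edit distance. This is an assumption made for illustration only: Oracle Text's fuzzy matching is not necessarily based on Levenshtein distance, and the sketch ignores the % wildcards. The function name `edit_distance` is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


# (search term, stored token) pairs taken from q4 and q1.
pairs = [
    ("gastrates", "gastritis"),
    ("diabetis", "diabetesmellitus"),
]
distances = {pair: edit_distance(*pair) for pair in pairs}

for (term, token), dist in distances.items():
    print(f"{term} -> {token}: {dist}")
```

Under this measure the term in q4 is 2 edits away from its token, while the term in q1 is 8 edits away from the shortest diabetes token, almost entirely because the token is so much longer than the term. That would be consistent with word length being the issue, if the fuzzy matching behaves similarly.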
Any ideas or directions where to look for a solution to this are greatly appreciated.
Thanks in advance and best regards
Patrick Baer