Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Oracle text search - special characters issue

716687Aug 11 2009 — edited May 28 2013
Hi.

I'm facing a real annoying problem with text search query, and everything I've tried failed...

I have a table with a varchar column indexed by text index. The column contains special characters like '&', ',' and mainly- '-'. Since I want to disregard these special characters for searches I have created a basic lexer of type skipjoins for the column index. So now, the phrase 'aaa-bbb something'. for example, can be searched without '-', like this: 'aaabbb'. But I want to make it possible for this phrase to be searched with and without '-'. So, that when the user enters 'aaabbb' he will get the same results as when he enters 'aaa-bbb'.

In other words, This condition:
WHERE CONTAINS(column, '<query> <textquery grammar="context"> <progression><seq>'
||'aaabbb'
||'</seq></progression> </textquery> </query> ' ,1)> 2

Will return the same results as this condition:
WHERE CONTAINS(r.POI_NAME, '<query> <textquery grammar="context"> <progression><seq>'
||'aaa-bbb'
||'</seq></progression> </textquery> </query> ' ,1)> 2

Since text query treats the '-' sign as a minus sign and searches for 'aaa' which doesn't contain 'bbb', the only way I found to fix this was to wrap the search text with {}. like this:

WHERE CONTAINS(r.POI_NAME, '<query> <textquery grammar="context"> <progression><seq>'
||'{aaa-bbb}'
||'</seq></progression> </textquery> </query> ' ,1)> 2

This all went very well, until I wanted to create a relaxation query. like this:

WHERE CONTAINS(r.POI_NAME, '<query> <textquery grammar="context"> <progression><seq>'
||'{aaab}'
||'</seq><seq>'
||'{aaab}'
||'%</seq></progression> </textquery> </query> ' ,1)> 2

In this case, I would expect the first part of the query to return no results (since it's not the whole word) but the second part, using '%' should have returned the record of 'aaa-bbb'. It doesn't. It will only return my result if I remove the '{}' for the second part. I can't do that, because the exact same search, when containing '-', will not return the expected results when I remove the braces (the sign is treated as minus sign):

WHERE CONTAINS(r.POI_NAME, '<query> <textquery grammar="context"> <progression><seq>'
||'{aaab}'
||'</seq><seq>'
||'aaa-b'
||'%</seq></progression> </textquery> </query> ' ,1)> 2

So I now have no solution. My question is- How can I create a query that will disregard the minus sign and treat it as a regular sign, but would still handle percentage sign as a special sign. So that I could run a query like the last example and will get the results of searching the phrase 'aaa-b%'?

In short, and to simplify my question, I'm looking for a way to escape all characters (not only the minus sign) except for a specific character. Kind of like 'unescaping' a specific character (the '%' sign) within braces {}. Or, another way would be to remove the space that is added to the phrase inside the braces at the end of the word, preventing me from adding "%" at the end of the word, outside the braces.

Thanks you,
Nili
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Jun 25 2013
Added on Aug 11 2009
15 comments
4,915 views