Java Programming

Matcher replaceAll vs String replaceAll and + plus sign anomaly?

807591, May 23 2008 (edited May 23 2008)
I have code that is meant to highlight search terms found within a given string.

I searched the string "Java, C++, C, etc." for the term "C++" and got really strange results.

Below is a snippet.
 // Now the highlighting portion               
            /** The temporary start tag, just some uncommon Unicode character. */
            final String START_TAG = Character.toString( (char)0x0141 );
            /** The temporary end tag. */
            final String END_TAG = Character.toString( (char)0x0142 );
            String highlighted = result.toString();
            for (int m = 0; m < foundPhrases.size(); m++) {
                String phrase = foundPhrases.get(m);
             
                String pattern = "[^" + START_TAG + "](" + phrase + ")";
                Pattern firstMatchedPattern = Pattern.compile( pattern, Pattern.CASE_INSENSITIVE );
                Matcher matcher = firstMatchedPattern.matcher( highlighted );
                
                while( matcher.find( ) ) {
                    String foundPhrase = matcher.group( 1 );
             
                    StringBuffer hilitebuffer = new StringBuffer( START_TAG );
                    hilitebuffer.append( foundPhrase ).append( END_TAG );
                       
                    System.out.println("hilitebuffer = " + hilitebuffer.toString());
              
                    //highlighted = matcher.replaceAll(hilitebuffer.toString()); 
                    highlighted = highlighted.replaceAll( foundPhrase, Matcher.quoteReplacement(hilitebuffer.toString( ))); // also cannot handle +
                    System.out.println("highlighted= " + highlighted);
                }
                
                highlighted = highlighted.replaceAll( START_TAG,
                      CoreServices.highlightTagStart );
                highlighted = highlighted.replaceAll( END_TAG, CoreServices.highlightTagEnd );
                  
            }
            System.out.println( "highlightMatchingPhrases.return: " + highlighted );
            return highlighted;
Using String's replaceAll to assign highlighted, I get:
hilitebuffer = ?C++?
highlighted= Java, ?C++?++, ?C++?, etc.
highlightMatchingPhrases.return: Java, <b style="color:black;background-color:#ffff66">C++</b>++, <b style="color:black;background-color:#ffff66">C++</b>, etc.
Note the extra ++ after the first match, and that the standalone C has been turned into a highlighted C++.
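
One possible reading of the output above, offered as a sketch rather than a confirmed diagnosis: String.replaceAll treats its first argument as a regular expression, so the text "C++" is read as the letter C followed by a possessive ++ quantifier and matches every bare C. The class and method names below are made up for illustration; Pattern.quote and Matcher.quoteReplacement are the standard java.util.regex helpers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original code.
final class PlusSignDemo {

    // Passes the phrase straight through, so it is interpreted as a regex.
    static String asRegex(String text, String phrase, String replacement) {
        return text.replaceAll(phrase, replacement);
    }

    // Quotes the phrase (and the replacement) so both are taken literally.
    static String asLiteral(String text, String phrase, String replacement) {
        return text.replaceAll(Pattern.quote(phrase),
                               Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String text = "Java, C++, C, etc.";
        // "C++" as a regex matches each run of C, leaving the literal ++ behind.
        System.out.println(asRegex(text, "C++", "[C++]"));   // Java, [C++]++, [C++], etc.
        // Quoted, only the literal text C++ is replaced.
        System.out.println(asLiteral(text, "C++", "[C++]")); // Java, [C++], C, etc.
    }
}
```

If no regex features are needed at all, String.replace(CharSequence, CharSequence) does a purely literal replacement and avoids the quoting step.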

Using Matcher's replaceAll, I get:
hilitebuffer = ?C++?
highlighted= Java,?C++?, C, etc.
highlightMatchingPhrases.return: Java,<b style="color:black;background-color:#ffff66">C++</b>, C, etc.
The problem here is very subtle: the space preceding the text C++ has been replaced as well. So if the text string were "rollerblading" and the search term "blading", the resulting highlighted text would be "rolleblading", missing the r. Not good.
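
A likely explanation, stated as an assumption: the leading [^...] character class consumes one character, so Matcher.replaceAll replaces that character along with the phrase. A minimal sketch of one way around it is a negative lookbehind, which checks the preceding character without making it part of the match. The class name and highlight method are hypothetical, and the tag characters mirror START_TAG and END_TAG from the snippet.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the original highlightMatchingPhrases method.
final class LookbehindHighlight {
    static final String START_TAG = "\u0141"; // same temporary markers as the snippet
    static final String END_TAG = "\u0142";

    static String highlight(String text, String phrase) {
        // (?<!X) asserts "not preceded by X" but consumes nothing;
        // Pattern.quote keeps characters such as + literal.
        Pattern p = Pattern.compile(
                "(?<!" + START_TAG + ")(" + Pattern.quote(phrase) + ")",
                Pattern.CASE_INSENSITIVE);
        // $1 re-inserts the matched text, preserving its original case.
        String replacement = Matcher.quoteReplacement(START_TAG)
                + "$1" + Matcher.quoteReplacement(END_TAG);
        return p.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(highlight("Java, C++, C, etc.", "C++"));
        System.out.println(highlight("rollerblading", "blading"));
    }
}
```

With this pattern the space before C++ and the r in "rollerblading" are left in place, and a phrase at the very start of the string can also match, which the [^...] form cannot do because it requires a preceding character.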

Reproducible every time.

Ideas?

Thanks!

Ginni
Post Details
Locked on Jun 20 2008
Added on May 23 2008