Java's RegEx
807606, May 7 2007 (edited May 10 2007)
I am trying to identify any characters that fall outside normal English-language text and a small set of punctuation characters. I used the code below to try to do this, but the regex does not behave as intended. Can anyone identify the source of the problem?
String aTestString = "My, Telecci������n, Inc.\"#$%&\'()+,-./:;<=>?@[\\]^_`{|}~*";
String filter = "[^\\w\\s[\\p{Punct}&&[^$%=+*]]]";
Pattern filterPattern = Pattern.compile( filter );
Matcher filterMatcher = filterPattern.matcher( aTestString );
if ( filterMatcher.find( ) )
{
    System.out.println( "found err char: " + filterMatcher.group() + " << ");
}
The output is > found err char: , <<
whereas I am expecting > found err char: � <<
I would appreciate any ideas. Thank you.
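
For reference, here is a minimal sketch of one way to express the same intent without nesting a class inside a negated class. How `^`, nested classes, and `&&` combine inside a single character class is easy to misread, and as far as I know the handling of nested classes under negation changed in Java 9, so the filter above may behave differently depending on the Java version. The sketch uses an alternation of two flat classes instead. The class name `DisallowedCharFinder`, the method `firstDisallowed`, and the sample string with U+00F3 are hypothetical and used only for illustration; they are not from the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DisallowedCharFinder {
    // Matches a single character that is either
    //   - not a word character, whitespace, or ASCII punctuation, or
    //   - one of the punctuation characters excluded from the allowed set: $ % = + *
    // Two flat classes joined by '|', so no nested class or && is involved.
    private static final Pattern DISALLOWED =
            Pattern.compile("[^\\w\\s\\p{Punct}]|[$%=+*]");

    // Returns the first disallowed character in the input, or null if there is none.
    static String firstDisallowed(String input) {
        Matcher m = DISALLOWED.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The accented letter (U+00F3) is an assumed stand-in, for illustration only.
        String sample = "My, Telecci\u00F3n, Inc.";
        System.out.println("found err char: " + firstDisallowed(sample) + " <<");
    }
}
```

With this pattern the comma and period in the sample are accepted as allowed punctuation, so the first reported character should be the accented letter. Note that `\w` and `\p{Punct}` are ASCII-only by default, which is what makes the accented letter count as disallowed here.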