Regex help?
807569, Sep 15 2006 (edited Sep 17 2006)
Hi everybody,
From what I've been told, it would be best to solve my StringTokenizer problem with a regex, but I have no idea how to even start. Can anyone help me out?
What I'd like to do is split a line of text I read in so it gets rid of all punctuation. However, I want to be able to keep accented characters in the words if they appear.
Also, I want to keep apostrophes in the words (ASCII code 0039) but not opening or closing single quotes (codes 0145 and 0146), and I want to keep hyphens (code 0045) but not long dashes (code 0151).
Eventually, I want to be able to put each match (word) into a list; is there any way to do that as the matches are found?
I know this is a tough problem, so any help would be great!
Jezzica85
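
For reference, here is a minimal sketch of one way the requirements above could be met with java.util.regex instead of StringTokenizer. It is not the only approach. The class name WordSplitter and the method words are made up for illustration. It assumes the codes 0145, 0146 and 0151 are the Windows-1252 curly quotes and long dash (U+2018, U+2019 and U+2014 in Java strings), and that an apostrophe or hyphen should be kept only when it has letters on both sides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class WordSplitter {
    // A word is a run of Unicode letters (\p{L} covers accented letters),
    // optionally continued by a single ASCII apostrophe (U+0027) or
    // hyphen (U+002D) followed by more letters. Curly quotes, long
    // dashes and other punctuation are not in the pattern, so they
    // simply act as separators.
    private static final Pattern WORD =
            Pattern.compile("\\p{L}+(?:['-]\\p{L}+)*");

    // Collects each word into a list as the matcher finds it.
    static List<String> words(String line) {
        List<String> result = new ArrayList<String>();
        Matcher m = WORD.matcher(line);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // \u2018 and \u2019 are curly single quotes, \u00e9 is e-acute,
        // \u2014 is a long dash.
        String line = "\u2018Caf\u00e9\u2019 isn't well-known \u2014 really!";
        for (String word : words(line)) {
            System.out.println(word);
        }
    }
}
```

Each call to find() advances to the next word, so matches go into the list as they are found. Under the assumptions above, a leading or trailing apostrophe (as in 'tis or dogs') is dropped, and accents stored as separate combining marks would need \p{M} added to the pattern.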