New to Java

Confused about Tokenizing with Regex

807601, Jun 24 2008 (edited Jun 25 2008)

I'm using a regex pattern to tokenize a String.
The code runs fine, but I'm curious about the output.
Here's the code:

public class Test2 {
    public static void main(String[] args) {
        String[] tokens = args[1].split(args[0]);
        for (String s : tokens) {
            System.out.println("Token: >" + s + "<");
        }
    }
}

My code prints angle brackets around each token so that whitespace is visible.
Here is my command-line invocation, where args[0] is the regex pattern to be used and args[1] is the source String:
java Test2 "\d*" "cY 39r k"
The output was:

Token: ><
Token: >c<
Token: >Y<
Token: > <
Token: ><
Token: >r<
Token: > <
Token: >k<

Am I right in saying that at index 0 there is a 'c', which is a delimiter because it is not a digit, so an empty String >< is printed? Index 1 contains 'Y', which is also a delimiter because it is not a digit, so >c< is printed. Index 2 contains a space, which is not a digit either, so it counts as a delimiter too. But then why isn't >cY< printed? Instead a space > < is printed, which I thought was the delimiter. I would have expected >cY<.
I read the Java tutorial on searching with regex, and if this were a search I can see that (off the top of my head) the output would be:
"" @ start index 0 and end index 0
"" @ start index 1 and end index 1
"" @ start index 2 and end index 2
"39" @ start index 3 and end index 5
"" @ start index 5 and end index 5
"" @ start index 6 and end index 6
"" @ start index 7 and end index 7
"" @ start index 8 and end index 8
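
The match list above can be checked rather than guessed with a small Matcher loop. This is only a sketch: the class name MatchLister and the helper listMatches are made up for illustration, and the pattern and input are hard-coded from the command line shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original Test2 program.
public class MatchLister {

    // Returns one description per match, in the order Matcher.find() reports them.
    static List<String> listMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add("\"" + m.group() + "\" @ start index " + m.start()
                    + " and end index " + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        // Same pattern and source String as in: java Test2 "\d*" "cY 39r k"
        for (String line : listMatches("\\d*", "cY 39r k")) {
            System.out.println(line);
        }
    }
}
```

Running it should print eight matches: seven empty ones and "39" from index 3 to 5. Note that \d* also matches the empty string directly after "39" (at index 5) and at the very end of the input (index 8).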

I just don't understand what's going on when the above regex is used as a delimiter for tokenizing.
Please help!
Thank you
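
One way to look at the question above is to treat every regex match as a delimiter and every stretch of text between two consecutive matches as a token. The sketch below does that by hand. It is an illustration of the idea, not the library's actual implementation: the names SplitByHand and splitByHand are invented here, and edge cases such as a limit argument or an input with no match at all are not handled the way String.split handles them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of splitting on matches; not the JDK source.
public class SplitByHand {

    static List<String> splitByHand(String regex, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int index = 0; // end of the previous delimiter
        while (m.find()) {
            // The token is whatever lies between the previous delimiter and this one.
            tokens.add(input.substring(index, m.start()));
            index = m.end();
        }
        // Text after the last delimiter.
        tokens.add(input.substring(index));
        // Like split with no limit, drop trailing empty strings.
        while (!tokens.isEmpty() && tokens.get(tokens.size() - 1).isEmpty()) {
            tokens.remove(tokens.size() - 1);
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String s : splitByHand("\\d*", "cY 39r k")) {
            System.out.println("Token: >" + s + "<");
        }
    }
}
```

With the pattern \d* the delimiters are mostly empty matches that sit between the characters, so 'c', 'Y' and the space are tokens, not delimiters. The empty match at index 0 yields the leading ><, "39" is the only non-empty delimiter, and the empty match right after it at index 5 yields the second ><. This reproduces the eight lines posted above. On Java 8 and later, String.split itself no longer emits the leading empty token for a zero-width match at the start of the input, so the real split would print one line fewer there.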

Locked on Jul 23 2008