Java Programming

Problems parsing text addresses

807607, Oct 27 2006 (edited Oct 27 2006)
Hi, I'm having a problem thinking through this logically and could sure use some help.

I've got addresses whose elements vary: any given address can contain any or all of the following elements:

Business name
Address 1
Address 2
City/St/Zip
County

The problem is that I don't necessarily have all those elements every time. Sometimes it's business name, addr1, csz (City/St/Zip), and county. Sometimes it's addr1, addr2, and csz. There is ALWAYS at least an addr1 and a csz, and addr1, addr2, and csz always appear in that order. Other than that, anything goes.

I've written an if...else if statement that seemed to work, but then I ran into situations that crashed the algorithm. I explain the situations below the code.

I'm just having problems getting my brain around this problem and could use some help please.

String test = "Lake Mary Primary Care, LLC<br />4106 W. Lake Mary Boulevard<br />Suite 100<br />Lake Mary, FL  32746<br />Seminole County<br />";

// other possibilities
// String test = "Lake Mary Primary Care, LLC<br />4106 W. Lake Mary Boulevard<br />Suite 100<br />Lake Mary, FL  32746<br />";
// String test = "Lake Mary Primary Care, LLC<br />4106 W. Lake Mary Boulevard<br />#100<br />Lake Mary, FL  32746<br />Seminole County<br />";
// String test = "4106 W. Lake Mary Boulevard<br />Suite 100<br />Lake Mary, FL  32746<br />Seminole County<br />";


String [] tA = test.split("<br />");

String addrcsz = "", addr = "", addr2 = "", csz = "", city = "", st = "", zipcode = "";

// flag indicating address1 already detected, and csz already detected
boolean a = false, c = false;


// classify each element in order
for (int x = 0; x < tA.length; x++) {
	// skip empty elements so the substring() calls below are safe
	if (tA[x].isEmpty()) continue;

	// if it starts with a number, then it is most likely addr1
	if (isNumeric(tA[x].substring(0, 1))) {
		System.out.println("This is an address");
		// flag addr1 found
		a = true;
		// assign to addr
		addr = tA[x];

	// if the last 5 characters are numeric, then it's most likely a csz
	// (the length check keeps short elements such as "#100" from throwing in substring())
	} else if (tA[x].length() >= 5 && isNumeric(tA[x].substring(tA[x].length() - 5))) {
		System.out.println("This is a csz");
		// flag csz found
		c = true;

		// split up the csz into city, state, and zip
		csz = tA[x].replaceAll("  ", " ");
		// a "-" five characters from the end means ZIP+4 (e.g. 32746-1234)
		if (csz.substring(csz.length() - 5, csz.length() - 4).equals("-")) {
			zipcode = csz.substring(csz.length() - 10);
			city = csz.substring(0, csz.indexOf(","));
			st = csz.substring(csz.length() - 13, csz.length() - 11);
		} else {
			zipcode = csz.substring(csz.length() - 5);
			city = csz.substring(0, csz.indexOf(","));
			st = csz.substring(csz.length() - 8, csz.length() - 6);
		}

	// if element is between addr1 and csz, then it is addr2
	} else if (a && !c) {
		System.out.println("This is address2");
		// assign to addr2
		addr2 = tA[x];

	} else {
		System.out.println("Not address or csz");
	}
}

// true if the string is an optionally signed integer or decimal number
public static boolean isNumeric(String string) {
	return string.matches("^[-+]?\\d+(\\.\\d+)?$");
}

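One problem occurred while I was testing for numeric values. For instance, if I'm given "#100" as the addr2, then the zip code test (last 5 characters numeric), run without a length check, crashes: "#100" is only 4 characters long, so substring() gets a negative start index and throws a StringIndexOutOfBoundsException.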
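
One way to avoid that kind of crash is to let a regular expression do the "ends in a zip code" test, since a regex simply fails to match a short string instead of throwing. The following is only a sketch: the class name ZipSuffixCheck and the method endsWithZip are made up for illustration, and it assumes US-style zip codes (5 digits, optionally followed by -NNNN).

```java
import java.util.regex.Pattern;

class ZipSuffixCheck {
    // Assumption: a csz line ends in a 5-digit zip, optionally followed by "-NNNN".
    // The \b keeps a longer digit run (e.g. "123456") from being read as a zip.
    private static final Pattern ENDS_WITH_ZIP =
            Pattern.compile(".*\\b\\d{5}(-\\d{4})?\\s*$");

    // Returns true if s ends in something shaped like a zip code.
    // Short or empty strings simply do not match; nothing is thrown.
    static boolean endsWithZip(String s) {
        return s != null && ENDS_WITH_ZIP.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Lake Mary, FL  32746",
            "Lake Mary, FL 32746-1234",
            "#100",
            "Suite 100"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + endsWithZip(s));
        }
    }
}
```

Note that this only checks the shape of the line's ending. An address line such as "PO Box 12345" would also pass, so the check probably needs to be combined with something else (for example, requiring a comma and a two-letter state before the zip).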

So I guess I just don't know how to pre-validate and test for things like spaces and numeric values BEFORE I run it through this if... else if algorithm. Does that make sense?? I'm sure there is probably a better solution, so I'm willing to think outside the box and entertain a whole different approach.
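
As a rough illustration of what a "whole different approach" might look like: since csz is always present, one option is to find the csz line first and then assign the other elements by their position relative to it. The sketch below is not a definitive solution. The class AddressParser, the method parse, and the map keys are hypothetical names, and it assumes a US-style "City, ST 12345" line and that addr1 usually starts with a digit (the same heuristic used in the code above).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AddressParser {
    // Assumed csz shape: "City, ST 12345" or "City, ST 12345-6789".
    private static final Pattern CSZ = Pattern.compile(
            "^\\s*(.+?),\\s*([A-Za-z]{2})\\s+(\\d{5}(?:-\\d{4})?)\\s*$");

    static Map<String, String> parse(String raw) {
        Map<String, String> out = new LinkedHashMap<>();
        String[] lines = raw.split("<br />");

        // 1. Find the csz line; it anchors everything else.
        int cszIdx = -1;
        for (int i = 0; i < lines.length; i++) {
            Matcher m = CSZ.matcher(lines[i]);
            if (m.matches()) {
                cszIdx = i;
                out.put("city", m.group(1).trim());
                out.put("state", m.group(2).toUpperCase());
                out.put("zip", m.group(3));
                break;
            }
        }
        if (cszIdx < 1) {
            return out; // no csz found, or nothing before it that could be addr1
        }

        // 2. addr1 is the first line before csz that starts with a digit;
        //    if none does, fall back to the line directly before csz.
        int addrIdx = cszIdx - 1;
        for (int i = 0; i < cszIdx; i++) {
            if (!lines[i].isEmpty() && Character.isDigit(lines[i].charAt(0))) {
                addrIdx = i;
                break;
            }
        }
        out.put("addr1", lines[addrIdx].trim());

        // 3. Everything else follows from position relative to addr1 and csz.
        if (addrIdx > 0) {
            out.put("business", lines[addrIdx - 1].trim());
        }
        if (cszIdx - addrIdx == 2) {
            out.put("addr2", lines[addrIdx + 1].trim());
        }
        if (cszIdx + 1 < lines.length) {
            out.put("county", lines[cszIdx + 1].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String test = "Lake Mary Primary Care, LLC<br />4106 W. Lake Mary Boulevard"
                + "<br />#100<br />Lake Mary, FL  32746<br />Seminole County<br />";
        System.out.println(parse(test));
    }
}
```

Because no substring() offsets are computed by hand, a short element like "#100" cannot cause an index error here. The digit heuristic for addr1 is still a guess, though: a business name that begins with a number would be mistaken for addr1.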

Any help is GREATLY appreciated. Thank you.