Folks,
Is Doc Jam in da' house?
I'm trying to remove duplicate points (i.e. zero-length lines) from the Well-Known Text (WKT) representation of various geometries.
I think this code is "fairly close" to what I'm after... but no bananas yet:
import java.util.regex.Pattern;
import java.util.regex.Matcher;

class KrcHarness
{
    /**
     * Removes duplicate points from a WKT string.
     *
     * @param wkt the WKT string to clean
     * @return the WKT string with repeated points removed
     */
    private static String removeDuplicatePointsFromWkt(String wkt) {
        // A point followed by a comma, immediately repeated via back-reference \1.
        Pattern pattern = Pattern.compile("(\\d+(\\.\\d+)? -\\d+(\\.\\d+)?( \\d+(\\.\\d+)?)?,)\\1");
        while (true) {
            Matcher matcher = pattern.matcher(wkt);
            if (!matcher.find()) break;
            String group = matcher.group();
            // Keep only the first copy of the repeated point (up to and including its comma).
            String replacement = group.substring(0, group.indexOf(",") + 1);
            wkt = matcher.replaceAll(replacement);
            System.out.println("DEBUG: group=\"" + group + "\", replacement=\"" + replacement + "\", wkt=\"" + wkt + "\"");
        }
        return wkt;
    }

    public static void main(String[] args) {
        String actual = "MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150 -15.05,150 -15,150 -15,150.35 -15.35,150.35 -15,150.35 -15,150.4 -15)))";
        System.out.println("actual : " + actual);
        System.out.println("clean : " + removeDuplicatePointsFromWkt(actual));
    }
}
I get this output:
actual : MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150 -15.05,150 -15,150 -15,150.35 -15.35,150.35 -15,150.35 -15,150.4 -15)))
DEBUG: group="150 -15.05,150 -15.05,", replacement="150 -15.05,", wkt="MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150 -15.05,150.35 -15.35,150 -15.05,150.4 -15)))"
DEBUG: group="150 -15.05,150 -15.05,", replacement="150 -15.05,", wkt="MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150.35 -15.35,150 -15.05,150.4 -15)))"
clean : MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150.35 -15.35,150 -15.05,150.4 -15)))
I need this output:
Actual : MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150 -15.05,150 -15,150 -15,150.35 -15.35,150.35 -15,150.35 -15,150.4 -15)))
                                                ===================== ===============               =====================
manual : MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150 -15,150.35 -15.35,150.35 -15,150.4 -15)))
I don't understand why it finds the same string twice, or why it doesn't find the second and third repeated groups. I must be missing something basic.
Does anyone know how to remove a series of repetitions from a string?
Thanx for any help... Keith.
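One possible explanation, offered as a sketch rather than a definitive answer: matcher.replaceAll(replacement) replaces every match of the pattern with the same literal string, which here is the text taken from the first match only. All three repeated pairs therefore become "150 -15.05,", and that creates a fresh duplicate for the second pass to find. Using the back-reference "$1" as the replacement lets each match collapse to its own captured point. The class name WktDedupSketch, the method name removeDuplicatePoints and the revised pattern below are illustrative assumptions, not part of the original code. The pattern also assumes points are separated by a bare comma with no surrounding whitespace, as in the sample, and that ordinates are plain decimals (no exponents).

```java
import java.util.regex.Pattern;

// Hypothetical sketch: collapse runs of identical consecutive points in a WKT string.
class WktDedupSketch {

    // (?<=[(,])   the point must start right after "(" or ","
    // group 1     two or three ordinates separated by single spaces, each optionally negative
    // (?:,\1)+    one or more immediate repeats of exactly the same point
    // (?=[,)])    the last repeat must be a whole point, i.e. followed by "," or ")"
    private static final Pattern REPEATED_POINT = Pattern.compile(
        "(?<=[(,])(-?\\d+(?:\\.\\d+)?(?: -?\\d+(?:\\.\\d+)?){1,2})(?:,\\1)+(?=[,)])");

    static String removeDuplicatePoints(String wkt) {
        // "$1" refers to the point captured by *this* match, not a fixed string.
        return REPEATED_POINT.matcher(wkt).replaceAll("$1");
    }

    public static void main(String[] args) {
        String actual = "MULTIPOLYGON (((150.4 -15,150.4 -15.45,150 -15.05,150 -15.05,"
            + "150 -15,150 -15,150.35 -15.35,150.35 -15,150.35 -15,150.4 -15)))";
        System.out.println("actual : " + actual);
        System.out.println("clean  : " + removeDuplicatePoints(actual));
    }
}
```

Because the repeat is quantified with "+", a single pass handles runs of three or more identical points, so no loop is needed. The lookarounds are there to stop the pattern from matching in the middle of a number (for example treating "50 -15" inside "150 -15" as a point). On the sample string this should yield the "manual" line from the post.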