Hi,
I'm trying to parse an HTML web page into a W3C Document, but I'm getting the following exception:
java.net.MalformedURLException: no protocol
The code is:
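import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;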
.
.
.
URL u;
InputStream is = null;
DataInputStream dis;
String s;
StringBuffer xmlFeed = new StringBuffer();
try {
    u = new URL("http://www.google.com");
    is = u.openStream();
    dis = new DataInputStream(new BufferedInputStream(is));
    // Read the page line by line into the buffer.
    while ((s = dis.readLine()) != null) {
        xmlFeed.append(s);
    }
} catch (Exception ex) {
    System.out.println("no good");
}
// So far so good.....
try {
    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
    docBuilderFactory.setIgnoringComments(true);
    docBuilderFactory.setValidating(false);
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document doc = docBuilder.parse(xmlFeed.toString()); // The exception is thrown here.
} catch (Exception ex) {
    System.out.println(ex.getMessage());
}
Can anyone offer some assistance?
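
For reference, a likely cause, offered as a sketch rather than a confirmed fix: DocumentBuilder.parse(String) treats its String argument as a URI to load, not as the markup itself, so passing the page content would produce "no protocol". Wrapping the content in an InputSource over a StringReader makes the parser read the string directly. The class name ParseFromString and the helper parseXml are hypothetical names used only for illustration, and the sample input is made up. Real-world HTML is often not well-formed XML, so the parser may still reject a page such as the one above with a SAXParseException.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this sketch.
class ParseFromString {

    // Hypothetical helper: parses markup held in a String instead of
    // treating the String as a URI to fetch.
    static Document parseXml(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(true);
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // InputSource over a StringReader feeds the parser the content itself.
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up, well-formed sample input standing in for the downloaded page.
        Document doc = parseXml("<feed><title>example</title></feed>");
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "feed"
    }
}
```

In the code above, that would mean replacing docBuilder.parse(xmlFeed.toString()) with docBuilder.parse(new InputSource(new StringReader(xmlFeed.toString()))). Alternatively, docBuilder.parse(is) accepts the InputStream directly, which skips the intermediate buffer.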