Java EE (Java Enterprise Edition) General Discussion

Traversing a document with jtidy

Posted by 843834 on Dec 18 2008, edited Dec 18 2008

Hello,

I am trying to parse an XML page with JTidy, but I am having trouble traversing the whole document. I thought I had found a solution online, but it throws an error. Here is my code so far (this is the doGet method of the servlet):

PrintWriter pw = response.getWriter();
String param = request.getParameter("url");
URL url = new URL(param);
Tidy t = new Tidy();

HttpURLConnection u = (HttpURLConnection) url.openConnection();
u.connect();

Document page = t.parseDOM(u.getInputStream(), null);
DocumentTraversal dt = (DocumentTraversal) page;
NodeIterator ni = dt.createNodeIterator(page.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);

for (Node n = ni.nextNode(); n != null; n = ni.nextNode())
{
    pw.print(n.getNodeName());
}

Here is the error when I try to run the code:

java.lang.ClassCastException: org.w3c.tidy.DOMDocumentImpl cannot be cast to org.w3c.dom.traversal.DocumentTraversal
	proxy.doGet(proxy.java:36)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:627)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:729)

Does anyone have any experience with this?
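
For reference, the exception above says that the Document returned by parseDOM (an org.w3c.tidy.DOMDocumentImpl) does not implement DocumentTraversal, so the cast fails. One possible workaround, offered only as a minimal sketch, is to walk the tree by hand with the core org.w3c.dom.Node methods getFirstChild() and getNextSibling(). This assumes JTidy's nodes implement those methods as ordinary DOM nodes do. The class name ElementWalker and the helpers elementNames and parse are hypothetical and not part of the original code. The JDK's own XML parser stands in for t.parseDOM(...) so the example runs without JTidy.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper: visits elements without needing DocumentTraversal.
class ElementWalker {

    // Returns the names of all elements under root, in document (pre-order) order.
    static List<String> elementNames(Node root) {
        List<String> names = new ArrayList<>();
        collect(root, names);
        return names;
    }

    // Recursive walk using only methods declared on org.w3c.dom.Node.
    private static void collect(Node node, List<String> names) {
        if (node == null) {
            return;
        }
        // Same effect as NodeFilter.SHOW_ELEMENT: skip text, comments, etc.
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, names);
        }
    }

    // Stand-in for t.parseDOM(...): parses a string with the JDK's XML parser.
    // Assumption for the demo only; the servlet would pass the JTidy Document instead.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse demo XML", e);
        }
    }

    public static void main(String[] args) {
        Document page = parse("<html><head><title>t</title></head><body><p>x</p></body></html>");
        // prints: html head title body p
        System.out.println(String.join(" ", elementNames(page.getDocumentElement())));
    }
}
```

In the servlet, the loop over the NodeIterator could then be replaced by iterating over ElementWalker.elementNames(page.getDocumentElement()) and printing each name to pw, if the assumption about JTidy's nodes holds.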

Locked on Jan 15 2009

Added on Dec 18 2008

#java-technology-xml
