XPath attribute lookup inefficient

fommil, Mar 5 2009 (edited Nov 21 2009)
This is a cross-post, as I didn't realise we had a specific XML forum. The question began here: http://forums.sun.com/thread.jspa?messageID=10631949

Dear all,

I've just discovered a major bottleneck in my project regarding the use of XPath queries to read XML attributes. My XML documents each contain approximately 1,000 nodes of the same type. I use an XPath query to find all the nodes and then loop through them, running further XPath queries with each node as the context to read its attribute values, e.g. (WARNING: pseudocode, not Java)
entities = xpath.search("MyNode", xml);
for (entity : entities){
  // NOTE: the XPath search is on the 'entity', not on the whole document
  attr1 = xpath.search("@attr1", entity);
  attr2 = xpath.search("@attr2", entity);
  attr3 = xpath.search("@attr3", entity);
  attr4 = xpath.search("@attr4", entity);
}
I noticed that this is taking approximately 30 seconds for 1,000 nodes... even if the XPath queries are precompiled!
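
For readers who want to reproduce this, here is a minimal Java sketch of the XPath-per-attribute loop using the standard javax.xml.xpath API with precompiled expressions. The class name XPathAttrLookup, the /root/MyNode layout and the two-attribute sample document are assumptions made for illustration and are not taken from the original project.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathAttrLookup {

    // Parses an XML string into a DOM Document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Reads attr1 and attr2 of every MyNode, one XPath evaluation per attribute.
    public static List<String> readAttributes(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Compile each expression once, outside the loop.
        XPathExpression allNodes = xpath.compile("/root/MyNode"); // hypothetical layout
        XPathExpression attr1 = xpath.compile("@attr1");
        XPathExpression attr2 = xpath.compile("@attr2");

        NodeList entities = (NodeList) allNodes.evaluate(doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < entities.getLength(); i++) {
            // The context item is the individual element, not the whole document.
            String a1 = attr1.evaluate(entities.item(i));
            String a2 = attr2.evaluate(entities.item(i));
            out.add(a1 + "," + a2);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<root><MyNode attr1='a' attr2='b'/><MyNode attr1='c' attr2='d'/></root>");
        System.out.println(readAttributes(doc));
    }
}
```

Timing this against a document with around 1,000 MyNode elements should show whether the slowdown appears with a given JDK and XPath implementation.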

If I rewrite this as (again, pseudocode)
entities = xpath.search("MyNode", xml);
for (entity : entities){
  attrs = entity.getAttributes();
  attr1 = attrs.getNamedItem("attr1").getNodeValue();
  attr2 = attrs.getNamedItem("attr2").getNodeValue();
  attr3 = attrs.getNamedItem("attr3").getNodeValue();
  attr4 = attrs.getNamedItem("attr4").getNodeValue();
}
then the code executes almost instantly for 1,000 nodes! Obviously I'm now using the latter solution but I want to understand why XPath is so unbelievably inefficient here! Is my query sub-optimal?
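
The direct DOM approach can be sketched in Java roughly as follows. This is an illustration under assumptions, not the original code: the class name DomAttrLookup and the sample element layout are hypothetical, and it uses Element.getAttribute rather than getAttributes().getNamedItem(...), because getNamedItem returns null for a missing attribute (so chaining getNodeValue() would throw), whereas getAttribute returns an empty string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomAttrLookup {

    // Parses an XML string into a DOM Document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Reads attr1 and attr2 of every MyNode directly from the DOM, without XPath.
    public static List<String> readAttributes(Document doc) {
        NodeList entities = doc.getElementsByTagName("MyNode");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < entities.getLength(); i++) {
            // getElementsByTagName only returns elements, so the cast is safe.
            Element entity = (Element) entities.item(i);
            // getAttribute returns "" when the attribute is absent.
            out.add(entity.getAttribute("attr1") + "," + entity.getAttribute("attr2"));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<root><MyNode attr1='a' attr2='b'/><MyNode attr1='c'/></root>");
        System.out.println(readAttributes(doc));
    }
}
```

Note that getElementsByTagName matches descendants at any depth, which differs from a single-step XPath such as "MyNode"; that only matters if MyNode elements can be nested.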

From the other thread, it has come to light that the query may be stepping outside of the node it has been invoked on and creating a more complete result list. Is there any way to restrict the query to the 'entity' Node, so that it cannot reach the parent, siblings or children?
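
As a hedged illustration of the scoping question: a relative expression such as "@attr1" only selects from the context node, but the context node still belongs to the full document tree, so an absolute expression evaluated on it can reach every other node. One workaround sometimes suggested is to evaluate against a detached copy of the element, which has no parent or siblings. Whether that actually improves performance depends on the XPath implementation and is not measured here. The class and method names below are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ScopedXPath {

    // Parses an XML string into a DOM Document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Counts MyNode elements via an absolute path evaluated with the given context node.
    // For a node attached to a document, this sees the whole document.
    public static int countAllFromContext(Node context) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Double n = (Double) xpath.evaluate("count(//MyNode)", context, XPathConstants.NUMBER);
        return n.intValue();
    }

    // Evaluates a relative attribute query against a detached deep copy of the element.
    public static String attrOnDetachedCopy(Node entity, String attrName) throws Exception {
        // cloneNode(true) returns a copy with no parent, so it has no ancestors or siblings.
        Node copy = entity.cloneNode(true);
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("@" + attrName, copy);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<root><MyNode attr1='a'/><MyNode attr1='b'/><MyNode attr1='c'/></root>");
        Node first = doc.getDocumentElement().getFirstChild();
        System.out.println(countAllFromContext(first));
        System.out.println(attrOnDetachedCopy(first, "attr1"));
    }
}
```

Cloning has its own cost for large subtrees, so for plain attribute reads the direct DOM access is likely to remain the simpler option.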
Post Details
Locked on Dec 19 2009
Added on Mar 5 2009