This is a cross-post, as I didn't realise we had a dedicated XML forum. The question began here: http://forums.sun.com/thread.jspa?messageID=10631949
Dear all,
I've just discovered a major bottleneck in my project involving the use of XPath queries to read XML attributes. I have XML documents containing approximately 1,000 nodes of the same type. I use an XPath query to find all the nodes, then loop through them and run further XPath queries, each with the node as its context, to read the attribute values, e.g. (WARNING: pseudocode, not Java)
entities = xpath.search("MyNode", xml);
for (entity : entities){
    // NOTE: the XPath search is on the 'entity', not on the whole document
    attr1 = xpath.search("@attr1", entity);
    attr2 = xpath.search("@attr2", entity);
    attr3 = xpath.search("@attr3", entity);
    attr4 = xpath.search("@attr4", entity);
}
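For reference, a minimal Java sketch of the loop above might look like the following. It is an illustration under assumptions, not the actual project code: the class name `XPathAttributeLoop`, the method `readWithXPath`, and the `//MyNode` location of the nodes are all made up for the example. The expressions are compiled once, outside the loop, as described.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this illustration.
class XPathAttributeLoop {
    static final String[] ATTRS = {"attr1", "attr2", "attr3", "attr4"};

    /** Returns one row of attribute values per MyNode element. */
    static List<String[]> readWithXPath(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Compile every expression once, outside the loop.
        // "//MyNode" is an assumption about where the nodes sit in the document.
        XPathExpression findEntities = xpath.compile("//MyNode");
        XPathExpression[] readAttr = new XPathExpression[ATTRS.length];
        for (int a = 0; a < ATTRS.length; a++) {
            readAttr[a] = xpath.compile("@" + ATTRS[a]);
        }

        NodeList entities = (NodeList) findEntities.evaluate(doc, XPathConstants.NODESET);
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < entities.getLength(); i++) {
            Node entity = entities.item(i);
            String[] row = new String[ATTRS.length];
            for (int a = 0; a < ATTRS.length; a++) {
                // Relative query: the context item is the entity, not the document.
                // A missing attribute evaluates to the empty string.
                row[a] = readAttr[a].evaluate(entity);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><MyNode attr1='a' attr2='b' attr3='c' attr4='d'/></root>";
        for (String[] row : readWithXPath(xml)) {
            System.out.println(String.join(",", row));
        }
    }
}
```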
I noticed that this is taking approximately 30 seconds for 1,000 nodes... even if the XPath queries are precompiled!
If I rewrite this as (again, pseudocode):
entities = xpath.search("MyNode", xml);
for (entity : entities){
    attrs = entity.getAttributes();
    attr1 = attrs.getNamedItem("attr1").getNodeValue();
    attr2 = attrs.getNamedItem("attr2").getNodeValue();
    attr3 = attrs.getNamedItem("attr3").getNodeValue();
    attr4 = attrs.getNamedItem("attr4").getNodeValue();
}
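In Java, the DOM-based rewrite could be sketched roughly as below. Again this is only an illustration: the class name `DomAttributeLoop`, the method `readWithDom`, and the `//MyNode` query are assumptions. XPath is still used once to find the nodes, and the attributes are then read straight from each node's `NamedNodeMap`. The null check is an addition, since `getNamedItem` returns null for an attribute that is not present.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this illustration.
class DomAttributeLoop {
    static final String[] ATTRS = {"attr1", "attr2", "attr3", "attr4"};

    /** Returns one row of attribute values per MyNode element (null if absent). */
    static List<String[]> readWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // One XPath query to find the nodes; "//MyNode" is an assumed location.
        NodeList entities =
                (NodeList) xpath.evaluate("//MyNode", doc, XPathConstants.NODESET);

        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < entities.getLength(); i++) {
            // Plain DOM access: no XPath evaluation inside the loop.
            NamedNodeMap attrs = entities.item(i).getAttributes();
            String[] row = new String[ATTRS.length];
            for (int a = 0; a < ATTRS.length; a++) {
                Node attr = attrs.getNamedItem(ATTRS[a]);
                row[a] = (attr == null) ? null : attr.getNodeValue();
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><MyNode attr1='a' attr2='b' attr3='c' attr4='d'/></root>";
        for (String[] row : readWithDom(xml)) {
            System.out.println(String.join(",", row));
        }
    }
}
```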
then the code executes almost instantly for 1,000 nodes! Obviously I'm now using the latter solution, but I want to understand why XPath is so unbelievably inefficient here. Is my query sub-optimal?
From the other thread, it has come to light that the query may be stepping outside the node it was invoked on and building a larger result list than necessary. Is there any way to restrict the query to the 'entity' Node, so that it cannot reach the parent, siblings or children?
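One way to make sure a relative query cannot reach anything outside the node is to evaluate it against a detached shallow copy: `cloneNode(false)` keeps an element's attributes but gives it no parent and no children. The sketch below shows that idea only. The names `DetachedContextLoop` and `readFromDetachedCopies` and the `//MyNode` query are assumptions for the example, and whether this actually removes the slowdown depends on the XPath implementation and would need to be measured.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this illustration.
class DetachedContextLoop {
    static final String[] ATTRS = {"attr1", "attr2", "attr3", "attr4"};

    /** Evaluates the attribute queries against parentless, childless copies. */
    static List<String[]> readFromDetachedCopies(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        XPathExpression findEntities = xpath.compile("//MyNode"); // assumed location
        XPathExpression[] readAttr = new XPathExpression[ATTRS.length];
        for (int a = 0; a < ATTRS.length; a++) {
            readAttr[a] = xpath.compile("@" + ATTRS[a]);
        }

        NodeList entities = (NodeList) findEntities.evaluate(doc, XPathConstants.NODESET);
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < entities.getLength(); i++) {
            // Shallow copy: attributes are copied, children are not, and the
            // copy has no parent. The original document is left untouched.
            Node isolated = entities.item(i).cloneNode(false);
            String[] row = new String[ATTRS.length];
            for (int a = 0; a < ATTRS.length; a++) {
                row[a] = readAttr[a].evaluate(isolated);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><MyNode attr1='a' attr2='b' attr3='c' attr4='d'><child/></MyNode></root>";
        for (String[] row : readFromDetachedCopies(xml)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

The copy costs one extra object per node, so timing it against the plain DOM version on a realistic document is the only way to tell whether it is worth keeping.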