Hi,
I have some huge XML documents (~5 GB) and I use StAX to read them.
My problem: I want to load only part of a document.
I know the byte offset at which the input stream should start, so I skip about half of the file.
Then I pull events in a loop on xmlReader.hasNext(). After the first iteration, though, I get this exception:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[34,4]
Message: The markup in the document following the root element must be well-formed.
The original XML looks like this:
<root>
<element id="1">
</element>
<element id="2">
</element>
<element id="3">
</element>
</root>
And this is what I pass to the XMLStreamReader:
<element id="2">
</element>
<element id="3">
</element>
So I know why I get it: the input stream contains only part of the document.
When the reader reaches the element with id=3, it reports that the document is not well-formed,
which is correct on the one hand, but on the other hand does not matter to me.
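For what it's worth, the behaviour can be reproduced without the 5 GB file. This is only a minimal sketch, assuming the JDK's default StAX implementation; the names FragmentRepro and countStartElements are made up for illustration and are not part of my real code.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FragmentRepro {
    // Counts START_ELEMENT events until the input ends or the parser rejects it.
    // (Hypothetical helper, only used to show the failure.)
    static int countStartElements(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamReader.START_ELEMENT) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Same shape as the part of the file I hand to the reader: two top-level elements.
        String fragment = "<element id=\"2\">\n</element>\n<element id=\"3\">\n</element>";
        try {
            System.out.println("start elements: " + countStartElements(fragment));
        } catch (XMLStreamException e) {
            // The first element is treated as the root, so the second one is rejected.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The same two elements inside a single enclosing element parse without complaint, so the only thing the reader objects to is the missing root.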
Are there any possible solutions? Is there a way to disable this check in the XMLStreamReader, or something else I don't know about?
And no, I cannot wrap part of a 5 GB file in something else. That's not the point, and it would be too slow.
That is why I want to skip so much data in the first place: to make it quick.
The problem is annoying and a little bit stupid.
A solution would be to write my own parser instead of using the XMLStreamReader, but that would be dirty and a duplication of effort.
-------part of the code--------
FileInputStream inputStream = new FileInputStream(filename);
// Position the stream at the start of the part I want to read.
inputStream.skip(skipBytes);
// filename is passed as the system ID; the reader starts at the skipped position.
xmlReader = xmlif.createXMLStreamReader(filename, inputStream);
while (xmlReader.hasNext() && !parsingComplete) {
    xmlReader.next();
    if (xmlReader.isStartElement()) {
        parseStartElement(xmlReader);
    }
}
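One idea that might avoid both a custom parser and copying the file, sketched here and not tried on the real data: prepend a synthetic root start tag at the stream level with java.io.SequenceInputStream. The sketch assumes that skipBytes lands exactly on the start of an element, that the file is UTF-8, and that the tail of the file still carries the original </root>. The names SkipAndWrap, skipFully, withRootPrefix and readIds are hypothetical, and the in-memory document only stands in for the big file.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SkipAndWrap {
    // Skips exactly n bytes; InputStream.skip is allowed to skip fewer than requested.
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                // Fall back to reading one byte to tell "no progress" apart from end of stream.
                if (in.read() < 0) {
                    throw new EOFException("offset is beyond the end of the stream");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    // Chains a small "<rootName>" prefix in front of the already positioned stream.
    // Nothing from the large stream is copied or buffered here.
    static InputStream withRootPrefix(InputStream positioned, String rootName) {
        byte[] open = ("<" + rootName + ">").getBytes(StandardCharsets.UTF_8);
        return new SequenceInputStream(new ByteArrayInputStream(open), positioned);
    }

    // Collects the id attribute of every <element> start tag (illustrative only).
    static List<String> readIds(InputStream in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> ids = new ArrayList<>();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamReader.START_ELEMENT
                    && "element".equals(reader.getLocalName())) {
                ids.add(reader.getAttributeValue(null, "id"));
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        String doc = "<root>\n<element id=\"1\">\n</element>\n<element id=\"2\">\n</element>\n"
                + "<element id=\"3\">\n</element>\n</root>";
        // The sample is pure ASCII, so the character index equals the byte offset.
        long skipBytes = doc.indexOf("<element id=\"2\">");
        InputStream in = new ByteArrayInputStream(doc.getBytes(StandardCharsets.UTF_8));
        skipFully(in, skipBytes);
        System.out.println(readIds(withRootPrefix(in, "root")));
    }
}
```

SequenceInputStream only chains the two streams, so the big file would still be read once from the skipped position onward. If parsing stops before the end of the file, a missing closing tag should not matter, since the reader only complains about it when it actually reaches the end of the input.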
Thanks for the help and any opinions.
Andreas