Counting rows in JDBC
409231Nov 17 2003 — edited Nov 19 2003I've recently come across the problem of needing to count the number of rows in a potentially large resultset.
I did quite a bit of digging and found that the only suggested answers appear to be:
1. Do a SELECT COUNT(*) before doing the query.
2. Make the ResultSet scrollable, and use rset.last() and rset.getRow().
I have problems with both approaches:
1a. Two queries are issued rather than one - this could be quite expensive.
1b. The queries I am executing are dynamic - thus some clever logic is required to parse the statement and work out what we should be counting.
1c. The number of hits may change between the counting and the actual query.
2a. Definitely most simple, but the Oracle JDBC driver iterates through all the rows to get to the last one, meaning that the whole ResultSet is loaded in to the client-side cache, and thus killing the middle-tier unnecessarily.
2b. The results have to be scrollable - not a problem for me, but is too expensive for some.
I had settled on number 2 for a while, but performance was just too nasty when the ResultSet had more than a few million rows in it, and thus I've spent the last few days looking around for better alternatives.
select inner_2.*, rownum rowcount from (
select inner_1.*, (rownum*-1) neg_row, rownum pos_row from (
select COL1, COL2, COL3
from TABLE1, TABLE2
where TABLE1.COL1 = TABLE2.COL2
order by event_type
) inner_1
order by neg_row asc
) inner_2
order by pos_row asc
Let me explain. This query is determining the rowcount and returning the required columns all in one statement. All I need to do is look at the first row, and read the ROWCOUNT column (the last column) to know how many rows are in the resultset.
The disadvantage of this method is that an extra two sorts are required, and that there are three extra columns on the result set, which the caller didn't ask for. Hiding the extra three column is easy if you want a really clean solution.
These are definitely prices worth paying for me but I acknowledge that not everyone will find this price worth paying. However, having been immensely frustated with the lack of options for solving this problem, I thought I would share my solution with you all.
And on a less-altruistic note, if anyone can tell me how to optimise it further, has a better solution, or knows why it doesn't actually work like it appears to, I'd be more than interested to hear your comments.
Marcus