Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Is rank() really better than rownum for top-n-queries?

UW (Germany)May 21 2010 — edited May 25 2010
Several sources say that for Oracle databases rank() should be used instead of 'rownum <= n' for top-n-queries. But here we have an application, where we a lot of top-n queries are executed on a big table with several million rows and rank() has a quite bad performance. I get much better results when I use a query with rownum <= n but the programmer of the application doesn't want to change it in the software because of those articles about rank() and rownum. I wonder, whether it is possible, to find a better form of the rank()-query or an additional index, that gives me the same performance.

To explain my case I created the following example (if you try it, be aware that depending on the size of your dba_objects view you might need up to half a gig free space in your tablespace for this example).

create table big_objects
as
select
ascii(m.alpha)*100000+o.object_id object_id,
o.owner owner,
o.object_type,
m.alpha||'_'||o.object_name object_name,
sysdate-400+mod(100*object_id+99*ascii(m.alpha),365)+24/(o.object_id+ascii(m.alpha)) created,
o.status
from
(select distinct
upper(substr(object_name,1,1)) alpha
from
sys.dba_objects
where
upper(substr(object_name,1,1)) between 'A' and 'Z') m,
sys.dba_objects o
order by
object_name;

create index bigindex_1 on big_objects (owner, object_type, created);

analyze table big_objects compute statistics;

So my table looks a bit like dba_objects but with much more rows and I made a synthetic "created" date which is more similar to my real case, where top-n means a date selection of the newest records from a certain type.

Here is the size of the segments on an nearly empty 11gR2 database:

select segment_name, bytes, blocks from sys.dba_segments where segment_name like 'BIG%'

SEGMENT_NAME BYTES BLOCKS
-------------- ---------- ----------
BIGINDEX_1 75497472 9216
BIG_OBJECTS 142606336 17408

On my database the example table has approx. 1,9 Mio rows:

select count(*) from big_objects;

COUNT(*)
----------
1884246

and some 1,4% of those rows have owner = 'SYS' and object_type = 'INDEX'

select
count(*)
from big_objects
where owner = 'SYS'
and object_type = 'INDEX';

COUNT(*)
----------
25896

But I want to find only the 10 newest indexes for the owner SYS. I think the typical rank() approach would be:

select
owner,
object_type,
object_name,
object_id,
status,
created
from
( select
owner,
object_type,
object_name,
object_id,
status,
created,
rank() over (order by created desc) rnk
from
big_objects
where
owner = 'SYS'
and object_type = 'INDEX')
where rnk <= 10
order by created asc;

OWNER OBJECT_TYPE OBJECT_NAME OBJECT_ID STATUS CREATED
------------------------------ ------------------- -------------------------------- ---------- ------- --------------------
SYS INDEX B_COLLELEMIND 6600515 VALID 15.04.2010 19:05:55
SYS INDEX V_I_WRI$_OPTSTAT_IND_OBJ#_ST 8600466 VALID 15.04.2010 19:09:03
SYS INDEX G_I_RLS 7100375 VALID 15.04.2010 19:23:55
SYS INDEX V_I_DIR$SERVICE_UI 8600320 VALID 15.04.2010 19:31:33
SYS INDEX L_I_TSM_DST2$ 7600308 VALID 15.04.2010 19:36:26
SYS INDEX L_I_IDL_UB11 7600235 VALID 15.04.2010 19:57:34
SYS INDEX V_I_VIEWTRCOL1 8600174 VALID 15.04.2010 20:19:21
SYS INDEX L_I_TRIGGER2 7600162 VALID 15.04.2010 20:31:39
SYS INDEX L_I_NTAB1 7600089 VALID 15.04.2010 21:35:53
SYS INDEX B_I_SYN1 6600077 VALID 15.04.2010 22:08:07

10 rows selected.

Elapsed: 00:00:00.22

Execution Plan
----------------------------------------------------------
Plan hash value: 2911012437

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1427 | 188K| 1400 (1)| 00:00:17 |
| 1 | SORT ORDER BY | | 1427 | 188K| 1400 (1)| 00:00:17 |
|* 2 | VIEW | | 1427 | 188K| 1399 (1)| 00:00:17 |
|* 3 | WINDOW SORT PUSHED RANK | | 1427 | 79912 | 1399 (1)| 00:00:17 |
| 4 | TABLE ACCESS BY INDEX ROWID| BIG_OBJECTS | 1427 | 79912 | 1398 (0)| 00:00:17 |
|* 5 | INDEX RANGE SCAN | BIGINDEX_1 | 1427 | | 9 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("RNK"<=10)
3 - filter(RANK() OVER ( ORDER BY INTERNAL_FUNCTION("CREATED") DESC )<=10)
5 - access("OWNER"='SYS' AND "OBJECT_TYPE"='INDEX')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
25870 consistent gets
0 physical reads
0 redo size
1281 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
10 rows processed

As from the index only the first two columns are used, all the 25896 records that I found above are read and sorted just to find the ten newest ones. Many unnecessary blocks are read and luckily all needed database blocks were in memory already. In our real case quite often a lot of physical reads are performed, which makes the performance of the application even worse.

In my following example with a "rownum <= 10" all three columns of the index are used and the number of block gets is much, much smaller than in the rank() example.

select
owner,
object_type,
object_name,
object_id,
status,
created
from
big_objects
where
(owner, object_type, created)
in
( select
owner,
object_type,
created
from
( select /*+ first_rows(10) */
owner,
object_type,
created
from
big_objects
where
owner = 'SYS'
and object_type = 'INDEX'
order by
owner,
object_type,
created desc
)
where rownum <= 10
)
order by created asc;

OWNER OBJECT_TYPE OBJECT_NAME OBJECT_ID STATUS CREATED
------------------------------ ------------------- -------------------------------- ---------- ------- --------------------
SYS INDEX B_COLLELEMIND 6600515 VALID 15.04.2010 19:05:55
SYS INDEX V_I_WRI$_OPTSTAT_IND_OBJ#_ST 8600466 VALID 15.04.2010 19:09:03
SYS INDEX G_I_RLS 7100375 VALID 15.04.2010 19:23:55
SYS INDEX V_I_DIR$SERVICE_UI 8600320 VALID 15.04.2010 19:31:33
SYS INDEX L_I_TSM_DST2$ 7600308 VALID 15.04.2010 19:36:26
SYS INDEX L_I_IDL_UB11 7600235 VALID 15.04.2010 19:57:34
SYS INDEX V_I_VIEWTRCOL1 8600174 VALID 15.04.2010 20:19:21
SYS INDEX L_I_TRIGGER2 7600162 VALID 15.04.2010 20:31:39
SYS INDEX L_I_NTAB1 7600089 VALID 15.04.2010 21:35:53
SYS INDEX B_I_SYN1 6600077 VALID 15.04.2010 22:08:07

10 rows selected.

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 3360237620

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 760 | 17 (0)| 00:00:01 |
| 1 | SORT ORDER BY | | 10 | 760 | 17 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 10 | 760 | 17 (0)| 00:00:01 |
| 4 | VIEW | VW_NSO_1 | 10 | 200 | 2 (50)| 00:00:01 |
| 5 | HASH UNIQUE | | 10 | 200 | 4 (25)| 00:00:01 |
|* 6 | COUNT STOPKEY | | | | | |
| 7 | VIEW | | 11 | 220 | 3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN DESCENDING| BIGINDEX_1 | 1427 | 28540 | 3 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | BIGINDEX_1 | 3 | | 2 (0)| 00:00:01 |
| 10 | TABLE ACCESS BY INDEX ROWID | BIG_OBJECTS | 3 | 168 | 6 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - filter(ROWNUM<=10)
8 - access("OWNER"='SYS' AND "OBJECT_TYPE"='INDEX')
9 - access("OWNER"="OWNER" AND "OBJECT_TYPE"="OBJECT_TYPE" AND "CREATED"="CREATED")


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
26 consistent gets
0 physical reads
0 redo size
1281 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
10 rows processed

I made this comparison with the Oracle versions 10.2 and 11.2 and the result was more or less the same. How can I change the rank() query in a way that only the small number of really needed blocks are read from the database?
This post has been answered by Ramesh Eega on May 21 2010
Jump to Answer
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Jun 22 2010
Added on May 21 2010
11 comments
2,976 views