RBDMS software : 11.2.0.4 with July 2016 PSU
Operating System: RHEL 6.7
I am intermediate user of Oracle SQL.
S_ASSET table is 615 GB in size. NAT_IMSIICCID_CPROCE_LOAD is only 13MB in size.
The results from NAT_IMSIICCID_CPROCE_LOAD table (inner query) will be used to fetch records from S_ASSET table (outer query).
For scenarios like this, where INNER query has very less records than the outer query, I have been told that EXISTS will perform better than the query which uses IN operator
But, when I tested, the query which uses IN operator finished executing in 4 minutes and 55 seconds
and
the query which uses EXISTS was still running after 25 minutes and I had to cancel the execution !!!!
What could be the reason for this seeimingly weird behaviour ?
==== Query using IN =====
SQL> select /*+ GATHER_PLAN_STATISTICS */ count(*) from S_ASSET where row_id IN (select msisdn from NAT_IMSIICCID_CPROCE_LOAD) ;
COUNT(*)
----------
426510
Elapsed: 00:04:55.46
Execution Plan:
-------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:04:55.28 | 1281K| 174K| | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:04:55.28 | 1281K| 174K| | | |
| 2 | NESTED LOOPS | | 1 | 133K| 426K|00:04:57.41 | 1281K| 174K| | | |
| 3 | VIEW | VW_NSO_1 | 1 | 133K| 426K|00:00:01.18 | 1578 | 1452 | | | |
| 4 | HASH UNIQUE | | 1 | 133K| 426K|00:00:01.00 | 1578 | 1452 | 23M| 4152K| 26M (0)|
| 5 | TABLE ACCESS FULL| NAT_IMSIICCID_CPROCE_LOAD | 1 | 133K| 426K|00:00:00.07 | 1578 | 1452 | | | |
|* 6 | INDEX UNIQUE SCAN | S_ASSET_P1 | 426K| 1 | 426K|00:04:53.67 | 1279K| 172K| | | |
-------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("ROW_ID"="MSISDN")
Note
-----
- dynamic sampling used for this statement (level=1)
==== Same query using EXISTS =====
SQL> select /*+ GATHER_PLAN_STATISTICS */ count(*) from S_ASSET outer
where exists (select 1 from NAT_IMSIICCID_CPROCE_LOAD inner where outer.row_id = inner.msisdn);
I had to cancel the above query after 25 minutes. I hope the query on DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'allstats last') is accurate (below output)
--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 0 | 0 |
| 1 | SORT AGGREGATE | | 1 | 1 | 0 |00:00:00.01 | 0 | 0 |
|* 2 | INDEX FULL SCAN | S_ASSET_P1 | 1 | 28M| 1 |00:10:52.19 | 49M| 127 |
|* 3 | TABLE ACCESS FULL| NAT_IMSIICCID_CPROCE_LOAD | 43583 | 3844 | 1 |00:15:01.88 | 68M| 0 |
--------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( IS NOT NULL)
3 - filter("INNER"."MSISDN"=:B1)
Note
-----
- dynamic sampling used for this statement (level=1)