hi everyone.
I recently did a test to compare behavior of NESTED LOOPS and HASH JOIN, I have read that optimizer prefers HASH JOIN when joining large tables and theres is enough memory on hash area.
EXPLAIN PLAN FOR
SELECT null prueba_nl_2
FROM TICKLER_RECORDS t, CONTR_SERVICES c
WHERE t.co_id = c.co_id
---------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 44G| 533G| 2456K|
| 1 | NESTED LOOPS | | 44G| 533G| 2456K|
|* 2 | INDEX FAST FULL SCAN| FKITCRCDOC | 3559M| 19G| 2456K|
|* 3 | INDEX RANGE SCAN | FKICSCOID | 12 | 84 | |
---------------------------------------------------------------------
alter session set optimizer_mode = 'choose'
EXPLAIN PLAN FOR
SELECT /*+ USE_HASH(t,c)*/null prueba_nl_2 -- I have to use a HINT because still choose NESTED LOOPS
FROM TICKLER_RECORDS t, CONTR_SERVICES c
WHERE t.co_id = c.co_id;
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 44G| 533G| | 31M|
|* 1 | HASH JOIN | | 44G| 533G| 29G| 31M|
| 2 | INDEX FAST FULL SCAN| FKICSCOID | 1681M| 10G| | 615K|
|* 3 | INDEX FAST FULL SCAN| FKITCRCDOC | 3559M| 19G| | 2456K|
-----------------------------------------------------------------------------
SELECT TABLE_NAME, NUM_ROWS, BLOCKS, LAST_ANALYZED
FROM all_tables
WHERE table_name IN ('TICKLER_RECORDS','CONTR_SERVICES');
TABLE_NAME NUM_ROWS BLOCKS LAST_ANALYZED
------------------------------ ---------- ---------- -------------
CONTR_SERVICES 1681394700 25388287 11/09/2014 06
TICKLER_RECORDS 4306071580 150498719 12/09/2014 07
SELECT table_name, column_name , num_distinct, density, num_nulls, last_analyzed
FROM all_tab_columns
WHERE column_name = 'CO_ID'
and table_name IN ('TICKLER_RECORDS','CONTR_SERVICES')
and owner = 'SYSADM';
TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS LAST_ANALYZED
------------------------------ ------------------------------ ------------ ---------- ---------- -------------
CONTR_SERVICES CO_ID 135915882 7,35749189 0 11/09/2014 06
TICKLER_RECORDS CO_ID 78117065 1,28013002 746559020 12/09/2014 07
select name, value from v$parameter
WHERE name IN (
'optimizer_mode' ,
'pga_aggregate_target',
'hash_join_enabled' ,
'workarea_size_policy' ,
'optimizer_index_cost_adj' ,
'optimizer_dynamic_sampling' ,
'db_file_multiblock_read_count' );
NAME VALUE
---------------------------------------------------------------- --------------------------------------------------------------------------------
db_file_multiblock_read_count 16
hash_join_enabled TRUE
optimizer_mode CHOOSE
optimizer_index_cost_adj 10
pga_aggregate_target 125829120
workarea_size_policy AUTO
optimizer_dynamic_sampling 1
When runnin the NESTED_LOOPS statement rows are displayed almost immediatly, but when use HASH JOIN i have to cancel operation because never end.
I really tought Oracle go faster on HASH JOIN, but here is not the case.
i am not an expert, not a DBA, just a PL developer testing what i have read.
Best Regards.