OS
Microsoft Windows Server 2003 R2 (32-bit)
DB Version
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for 32-bit Windows: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
Preface
I got this query dropped on my desk today looking for help to improve performance. :)
I've also replaced table names, column names, and other values as well, so it may appear a little cryptic at first.
I believe I've located the problem (see below), but I'm at a loss on the best way to fix it.
Query
WITH latest_dattpoints AS
(
SELECT tere.dattpoint_id
, tere.datatype
, tere.sortkey
FROM dattpoint_revision tere
JOIN crd_revision mrre ON mrre.crd_revision_id = tere.crd_revision_id
WHERE mrre.effective_date = ( SELECT MAX(mrre2.effective_date)
FROM crd_revision mrre2
WHERE mrre2.crd_id = mrre.crd_id
)
), latest_datpoints AS
(
SELECT setp.datpoint_id
, setp.dattpoint_id
, setp.unit_class_id
, setp.relat_id
, sere.value
, sere.operator
, sere.color
FROM datpoint setp
JOIN datpoint_revision sere ON sere.datpoint_id = setp.datpoint_id
WHERE sere.effective_date = ( SELECT MAX(sere2.effective_date)
FROM datpoint_revision sere2
WHERE sere2.datpoint_id = sere.datpoint_id
)
), latest_oreps_per_comps AS
( SELECT comp.comps_id
, MAX(orep.orep_id) KEEP (DENSE_RANK LAST ORDER BY orep.orepdate NULLS FIRST) AS orep_id
, MAX(orep.orepdate) AS last_orep_date
FROM comps comp
JOIN datareview insp ON insp.comps_id = comp.comps_id
JOIN orep orep ON orep.datareview_id = insp.datareview_id
GROUP BY comp.comps_id
)
SELECT /*+ gather_plan_statistics */
data.data_id
, latest_datpoints.datpoint_id
, datareview.datareview_id
, latest_dattpoints.dattpoint_id
, data.data
, latest_datpoints.operator
, latest_datpoints.value
, latest_datpoints.color
, latest_dattpoints.datatype
, latest_dattpoints.sortkey
, datareview.completedate
, comps.comps_id
, loc.loc_id
, crd_division.division_id
, unit.unit_id
, unit.squadron_id
, relat.unit_class_id
, relat.mip_id
, relat.crd_id
, usr_crd.usr_id
, latest_oreps_per_comps.orep_id
, latest_oreps_per_comps.last_orep_date
FROM datareview
JOIN comps ON datareview.comps_id = comps.comps_id
JOIN unit ON comps.unit_id = unit.unit_id
JOIN relat ON comps.relat_id = relat.relat_id
JOIN data ON datareview.datareview_id = data.datareview_id
JOIN latest_dattpoints ON latest_dattpoints.dattpoint_id = data.dattpoint_id
JOIN latest_datpoints ON latest_datpoints.dattpoint_id = latest_dattpoints.dattpoint_id
LEFT JOIN latest_oreps_per_comps ON latest_oreps_per_comps.comps_id = comps.comps_id
LEFT JOIN usr_crd ON relat.crd_id = usr_crd.crd_id
AND relat.unit_class_id = usr_crd.unit_class_id
LEFT JOIN perio ON comps.unit_id = perio.unit_id
AND datareview.completedate BETWEEN perio.startdate AND perio.enddate
LEFT JOIN loc ON perio.loc_id = loc.loc_id
OR ( perio.perio_id IS NULL
AND unit.locahome = loc.loc_id
)
LEFT JOIN crd_division ON loc.loc_id = crd_division.loc_id
AND relat.crd_id = crd_division.crd_id
WHERE ( latest_datpoints.unit_class_id IS NULL
OR latest_datpoints.unit_class_id = relat.unit_class_id
)
AND ( latest_datpoints.relat_id IS NULL
OR latest_datpoints.relat_id = comps.relat_id
)
AND (
( data.is_numeric = 1
AND latest_dattpoints.datatype IN ('I','R')
AND (
( latest_datpoints.operator = 'LT'
AND TO_NUMBER(data.data) < TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'LTE'
AND TO_NUMBER(data.data) <= TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'GTE'
AND TO_NUMBER(data.data) >= TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'GT'
AND TO_NUMBER(data.data) > TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'EQ'
AND TO_NUMBER(data.data) = TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'NEQ'
AND TO_NUMBER(data.data) != TO_NUMBER(latest_datpoints.value)
)
)
)
OR ( data.is_numeric = 0
AND latest_dattpoints.datatype IN ('O')
AND (
( latest_datpoints.operator = 'EQ'
AND data.data = latest_datpoints.value
)
OR ( latest_datpoints.operator = 'NEQ'
AND data.data != latest_datpoints.value
)
)
)
)
Execution Plan
Plan hash value: 3808651332
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 102K|00:06:14.00 | 1684K| 76215 | 42952 | | | | |
| 1 | NESTED LOOPS OUTER | | 1 | 1 | 102K|00:06:14.00 | 1684K| 76215 | 42952 | | | | |
| 2 | VIEW | | 1 | 1 | 102K|00:06:11.36 | 1479K| 76215 | 42952 | | | | |
| 3 | NESTED LOOPS OUTER | | 1 | 1 | 102K|00:06:11.06 | 1479K| 76215 | 42952 | | | | |
| 4 | NESTED LOOPS OUTER | | 1 | 1 | 102K|00:06:08.64 | 1466K| 76215 | 42952 | | | | |
| 5 | VIEW | | 1 | 1 | 100K|00:05:58.88 | 124K| 76215 | 42952 | | | | |
|* 6 | FILTER | | 1 | | 100K|00:05:58.57 | 124K| 76215 | 42952 | | | | |
|* 7 | HASH JOIN OUTER | | 1 | 4362 | 795K|00:05:55.38 | 44269 | 76215 | 42952 | 118M| 7538K| 3508K (1)| 113K|
|* 8 | HASH JOIN RIGHT OUTER | | 1 | 4362 | 795K|00:05:49.48 | 40336 | 62168 | 28905 | 888K| 888K| 1337K (0)| |
| 9 | TABLE ACCESS FULL | USR_CRD | 1 | 7419 | 7419 |00:00:00.01 | 33 | 0 | 0 | | | | |
|* 10 | HASH JOIN | | 1 | 4362 | 795K|00:05:47.45 | 40303 | 62168 | 28905 | 848K| 848K| 1354K (0)| |
| 11 | TABLE ACCESS FULL | CRD_REVISION | 1 | 4385 | 4385 |00:00:00.01 | 65 | 0 | 0 | | | | |
|* 12 | HASH JOIN | | 1 | 4362 | 795K|00:05:45.44 | 40238 | 62168 | 28905 | 921K| 921K| 1225K (0)| |
| 13 | TABLE ACCESS FULL | UNIT | 1 | 168 | 168 |00:00:00.01 | 3 | 0 | 0 | | | | |
|* 14 | HASH JOIN | | 1 | 4362 | 795K|00:05:43.52 | 40235 | 62168 | 28905 | 919K| 919K| 1353K (0)| |
| 15 | TABLE ACCESS FULL | RELAT | 1 | 7860 | 7860 |00:00:00.01 | 65 | 0 | 0 | | | | |
|* 16 | HASH JOIN | | 1 | 4411 | 808K|00:05:43.96 | 40170 | 62168 | 28905 | 88M| 7674K| 3033K (1)| 87040 |
|* 17 | HASH JOIN | | 1 | 4411 | 808K|00:05:42.75 | 39511 | 52028 | 18765 | 77M| 7759K| 3036K (1)| 75776 |
|* 18 | HASH JOIN | | 1 | 4411 | 808K|00:08:32.44 | 36173 | 43253 | 9990 | 9216K| 1980K| 8459K (1)| 86016 |
|* 19 | HASH JOIN | | 1 | 88534 | 109K|00:00:00.41 | 2905 | 0 | 0 | 2175K| 1055K| 2952K (0)| |
| 20 | TABLE ACCESS FULL | DATPOINT_REVISION | 1 | 34319 | 34319 |00:00:00.02 | 324 | 0 | 0 | | | | |
|* 21 | HASH JOIN | | 1 | 81154 | 96329 |00:00:00.26 | 2581 | 0 | 0 | 1610K| 1171K| 1978K (0)| |
| 22 | TABLE ACCESS FULL | DATPOINT | 1 | 31416 | 31416 |00:00:00.02 | 144 | 0 | 0 | | | | |
| 23 | TABLE ACCESS FULL | DATTPOINT_REVISION | 1 | 235K| 235K|00:00:00.14 | 2437 | 0 | 0 | | | | |
| 24 | TABLE ACCESS FULL | DATA | 1 | 5917K| 5917K|00:00:03.26 | 33268 | 33263 | 0 | | | | |
| 25 | TABLE ACCESS FULL | DATAREVIEW | 1 | 315K| 315K|00:00:00.17 | 3338 | 0 | 0 | | | | |
| 26 | TABLE ACCESS FULL | COMPS | 1 | 66506 | 66506 |00:00:00.04 | 659 | 0 | 0 | | | | |
| 27 | VIEW | | 1 | 26097 | 10622 |00:00:00.46 | 3933 | 35 | 35 | | | | |
| 28 | SORT GROUP BY | | 1 | 26097 | 10622 |00:00:00.45 | 3933 | 35 | 35 | 974K| 710K| 865K (0)| |
|* 29 | HASH JOIN | | 1 | 26097 | 26097 |00:00:00.38 | 3933 | 35 | 35 | 1543K| 1122K| 1720K (0)| 1024 |
| 30 | TABLE ACCESS FULL | OREP | 1 | 26097 | 26097 |00:00:00.06 | 596 | 0 | 0 | | | | |
| 31 | TABLE ACCESS FULL | DATAREVIEW | 1 | 315K| 315K|00:00:00.23 | 3337 | 0 | 0 | | | | |
| 32 | SORT AGGREGATE | | 74449 | 1 | 74449 |00:00:01.92 | 75528 | 0 | 0 | | | | |
| 33 | FIRST ROW | | 74449 | 1 | 74449 |00:00:01.43 | 75528 | 0 | 0 | | | | |
|* 34 | INDEX RANGE SCAN (MIN/MAX)| SYS_C008600 | 74449 | 1 | 74449 |00:00:01.21 | 75528 | 0 | 0 | | | | |
| 35 | SORT AGGREGATE | | 2400 | 1 | 2400 |00:00:00.08 | 4259 | 0 | 0 | | | | |
| 36 | FIRST ROW | | 2400 | 1 | 2400 |00:00:00.07 | 4259 | 0 | 0 | | | | |
|* 37 | INDEX RANGE SCAN (MIN/MAX)| SYS_C008513 | 2400 | 1 | 2400 |00:00:00.06 | 4259 | 0 | 0 | | | | |
|* 38 | TABLE ACCESS BY INDEX ROWID | PERIO | 100K| 1 | 85666 |00:00:09.27 | 1342K| 0 | 0 | | | | |
|* 39 | INDEX RANGE SCAN | SYS_C008501 | 100K| 1 | 1398K|00:00:03.00 | 107K| 0 | 0 | | | | |
| 40 | VIEW | | 102K| 1 | 99199 |00:00:01.12 | 13431 | 0 | 0 | | | | |
|* 41 | INDEX FULL SCAN | SYS_C008483 | 102K| 1 | 99199 |00:00:00.75 | 13431 | 0 | 0 | | | | |
| 42 | TABLE ACCESS BY INDEX ROWID | CRD_DIVISION | 102K| 1 | 98748 |00:00:02.14 | 205K| 0 | 0 | | | | |
|* 43 | INDEX UNIQUE SCAN | SYS_C008507 | 102K| 1 | 98748 |00:00:01.17 | 106K| 0 | 0 | | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (RELATified by operation id):
---------------------------------------------------
6 - filter(("SERE"."EFFECTIVE_DATE"= AND "MRRE"."EFFECTIVE_DATE"=))
7 - access("LATEST_OREPS_PER_COMPS"."COMPS_ID"="COMPS"."COMPS_ID")
8 - access("RELAT"."UNIT_CLASS_ID"="USR_CRD"."UNIT_CLASS_ID" AND "RELAT"."CRD_ID"="USR_CRD"."CRD_ID")
10 - access("MRRE"."CRD_REVISION_ID"="TERE"."CRD_REVISION_ID")
12 - access("COMPS"."UNIT_ID"="UNIT"."UNIT_ID")
14 - access("COMPS"."RELAT_ID"="RELAT"."RELAT_ID")
filter(("SETP"."UNIT_CLASS_ID" IS NULL OR ("SETP"."UNIT_CLASS_ID"="RELAT"."UNIT_CLASS_ID" AND "SETP"."UNIT_CLASS_ID" IS NOT NULL)))
16 - access("DATAREVIEW"."COMPS_ID"="COMPS"."COMPS_ID")
filter(("SETP"."RELAT_ID" IS NULL OR ("SETP"."RELAT_ID"="COMPS"."RELAT_ID" AND "SETP"."RELAT_ID" IS NOT NULL)))
17 - access("DATAREVIEW"."DATAREVIEW_ID"="DATA"."DATAREVIEW_ID")
18 - access("TERE"."DATTPOINT_ID"="DATA"."DATTPOINT_ID")
filter(((CASE WHEN REGEXP_LIKE ("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A70890300000000027289030000000000000000800
10000EC90CB2B10000200000000000000000081000000') ) THEN 1 ELSE 0 END =1 AND INTERNAL_FUNCTION("TERE"."DATATYPE") AND (("SERE"."OPERATOR"='LT' AND
TO_NUMBER("DATA"."DATA")<TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='LTE' AND TO_NUMBER("DATA"."DATA")<=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
("SERE"."OPERATOR"='GTE' AND TO_NUMBER("DATA"."DATA")>=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='GT' AND
TO_NUMBER("DATA"."DATA")>TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='EQ' AND TO_NUMBER("DATA"."DATA")=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
("SERE"."OPERATOR"='NEQ' AND TO_NUMBER("DATA"."DATA")<>TO_NUMBER(TO_CHAR("SERE"."VALUE"))))) OR (CASE WHEN REGEXP_LIKE
("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A7089030000000002728903000000000000000080010000EC90CB2B100002000000000000000000810
00000') ) THEN 1 ELSE 0 END =0 AND "TERE"."DATATYPE"='O' AND (("SERE"."OPERATOR"='EQ' AND "SERE"."VALUE"=TO_NUMBER("DATA"."DATA")) OR ("SERE"."OPERATOR"='NEQ'
AND "SERE"."VALUE"<>TO_NUMBER("DATA"."DATA"))))))
19 - access("SERE"."DATPOINT_ID"="SETP"."DATPOINT_ID")
21 - access("SETP"."DATTPOINT_ID"="TERE"."DATTPOINT_ID")
29 - access("OREP"."DATAREVIEW_ID"="INSP"."DATAREVIEW_ID")
34 - access("SERE2"."DATPOINT_ID"=:B1)
37 - access("MRRE2"."CRD_ID"=:B1)
38 - filter("DATAREVIEW"."COMPLETEDATE"<="PERIO"."ENDDATE")
39 - access("COMPS"."UNIT_ID"="PERIO"."UNIT_ID" AND "DATAREVIEW"."COMPLETEDATE">="PERIO"."STARTDATE")
41 - filter(("PERIO"."LOC_ID"="LOC"."LOC_ID" OR ("PERIO"."PERIO_ID" IS NULL AND "from$_subquery$_030"."LOCAHOME"="LOC"."LOC_ID")))
43 - access("RELAT"."CRD_ID"="CRD_DIVISION"."CRD_ID" AND "LOC"."LOC_ID"="CRD_DIVISION"."LOC_ID")
My Thoughts / Questions
So following the usual process I retrieve the execution plan above with the GATHER_PLAN_STATISTICS hint enabled and began looking through to find areas where the A-Rows severely deviated from the E-Rows. It looks like the problem begins on ID = 18
|* 18 | HASH JOIN | | 1 | 4411 | 808K|00:08:32.44 | 36173 | 43253 | 9990 | 9216K| 1980K| 8459K (1)| 86016 |
18 - access("TERE"."DATTPOINT_ID"="DATA"."DATTPOINT_ID")
filter(((CASE WHEN REGEXP_LIKE ("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A70890300000000027289030000000000000000800
10000EC90CB2B10000200000000000000000081000000') ) THEN 1 ELSE 0 END =1 AND INTERNAL_FUNCTION("TERE"."DATATYPE") AND (("SERE"."OPERATOR"='LT' AND
TO_NUMBER("DATA"."DATA")<TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='LTE' AND TO_NUMBER("DATA"."DATA")<=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
("SERE"."OPERATOR"='GTE' AND TO_NUMBER("DATA"."DATA")>=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='GT' AND
TO_NUMBER("DATA"."DATA")>TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='EQ' AND TO_NUMBER("DATA"."DATA")=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
("SERE"."OPERATOR"='NEQ' AND TO_NUMBER("DATA"."DATA")<>TO_NUMBER(TO_CHAR("SERE"."VALUE"))))) OR (CASE WHEN REGEXP_LIKE
("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A7089030000000002728903000000000000000080010000EC90CB2B100002000000000000000000810
00000') ) THEN 1 ELSE 0 END =0 AND "TERE"."DATATYPE"='O' AND (("SERE"."OPERATOR"='EQ' AND "SERE"."VALUE"=TO_NUMBER("DATA"."DATA")) OR ("SERE"."OPERATOR"='NEQ'
AND "SERE"."VALUE"<>TO_NUMBER("DATA"."DATA"))))))
As you can see there is a
drastic difference in the E-Rows (4411) versus the A-rows (808K).
This corresponds directly to the last predicate in the main query:
AND (
( data.is_numeric = 1
AND latest_dattpoints.datatype IN ('I','R')
AND (
( latest_datpoints.operator = 'LT'
AND TO_NUMBER(data.data) < TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'LTE'
AND TO_NUMBER(data.data) <= TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'GTE'
AND TO_NUMBER(data.data) >= TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'GT'
AND TO_NUMBER(data.data) > TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'EQ'
AND TO_NUMBER(data.data) = TO_NUMBER(latest_datpoints.value)
)
OR ( latest_datpoints.operator = 'NEQ'
AND TO_NUMBER(data.data) != TO_NUMBER(latest_datpoints.value)
)
)
)
OR ( data.is_numeric = 0
AND latest_dattpoints.datatype IN ('O')
AND (
( latest_datpoints.operator = 'EQ'
AND data.data = latest_datpoints.value
)
OR ( latest_datpoints.operator = 'NEQ'
AND data.data != latest_datpoints.value
)
)
)
)
Basically the code is doing a comparison depending on the
operator. It even handles if the data is
numeric or not.
Unfortunately the data.data column is a VARCHAR2 column which stores various different types of data from strings to numeric values.
Given that the data model can't really be changed significantly I can't see how I can provide ANY additional information to the optimizer to correct this cardinality error. That's why I'm here. Any suggestions?