Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

How can I improve the optimizer's estimates?

CentinulFeb 24 2011 — edited Feb 25 2011
OS
Microsoft Windows Server 2003 R2 (32-bit)
DB Version
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for 32-bit Windows: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
Preface

I got this query dropped on my desk today looking for help to improve performance. :)

I've also replaced table names, column names, and other values as well, so it may appear a little cryptic at first.

I believe I've located the problem (see below), but I'm at a loss on the best way to fix it.

Query
WITH latest_dattpoints AS
(
        SELECT tere.dattpoint_id
             , tere.datatype
             , tere.sortkey
        FROM   dattpoint_revision tere
        JOIN   crd_revision       mrre ON mrre.crd_revision_id = tere.crd_revision_id
        WHERE  mrre.effective_date = ( SELECT MAX(mrre2.effective_date)
                                       FROM   crd_revision mrre2
                                       WHERE  mrre2.crd_id = mrre.crd_id
                                     )
),   latest_datpoints AS
(      
       SELECT setp.datpoint_id
            , setp.dattpoint_id
            , setp.unit_class_id
            , setp.relat_id
            , sere.value
            , sere.operator
            , sere.color
       FROM   datpoint          setp
       JOIN   datpoint_revision sere ON sere.datpoint_id = setp.datpoint_id
       WHERE  sere.effective_date = ( SELECT MAX(sere2.effective_date)
                                      FROM   datpoint_revision sere2
                                      WHERE  sere2.datpoint_id = sere.datpoint_id
                                    )
),   latest_oreps_per_comps AS
(      SELECT comp.comps_id
            , MAX(orep.orep_id) KEEP (DENSE_RANK LAST ORDER BY orep.orepdate NULLS FIRST) AS orep_id
            , MAX(orep.orepdate)                                                          AS last_orep_date
       FROM   comps      comp
       JOIN   datareview insp ON insp.comps_id       = comp.comps_id
       JOIN   orep       orep ON orep.datareview_id  = insp.datareview_id
       GROUP BY comp.comps_id
)
SELECT /*+ gather_plan_statistics */
       data.data_id
     , latest_datpoints.datpoint_id
     , datareview.datareview_id
     , latest_dattpoints.dattpoint_id
     , data.data
     , latest_datpoints.operator
     , latest_datpoints.value
     , latest_datpoints.color
     , latest_dattpoints.datatype
     , latest_dattpoints.sortkey
     , datareview.completedate
     , comps.comps_id
     , loc.loc_id
     , crd_division.division_id
     , unit.unit_id
     , unit.squadron_id
     , relat.unit_class_id
     , relat.mip_id
     , relat.crd_id
     , usr_crd.usr_id
     , latest_oreps_per_comps.orep_id
     , latest_oreps_per_comps.last_orep_date
FROM      datareview
JOIN      comps                  ON datareview.comps_id                = comps.comps_id
JOIN      unit                   ON comps.unit_id                      = unit.unit_id
JOIN      relat                  ON comps.relat_id                     = relat.relat_id
JOIN      data                   ON datareview.datareview_id           = data.datareview_id
JOIN      latest_dattpoints      ON latest_dattpoints.dattpoint_id     = data.dattpoint_id
JOIN      latest_datpoints       ON latest_datpoints.dattpoint_id      = latest_dattpoints.dattpoint_id
LEFT JOIN latest_oreps_per_comps ON latest_oreps_per_comps.comps_id    = comps.comps_id
LEFT JOIN usr_crd                ON relat.crd_id                       = usr_crd.crd_id 
                                 AND relat.unit_class_id               = usr_crd.unit_class_id
LEFT JOIN perio                  ON comps.unit_id                      = perio.unit_id 
                                 AND datareview.completedate           BETWEEN perio.startdate AND perio.enddate
LEFT JOIN loc                    ON perio.loc_id  = loc.loc_id 
                                 OR (   perio.perio_id       IS NULL 
                                    AND unit.locahome        = loc.loc_id
                                    )
LEFT JOIN crd_division           ON loc.loc_id                         = crd_division.loc_id 
                                 AND relat.crd_id                      = crd_division.crd_id
WHERE     (  latest_datpoints.unit_class_id IS NULL
          OR latest_datpoints.unit_class_id = relat.unit_class_id
          )
AND       (  latest_datpoints.relat_id IS NULL
          OR latest_datpoints.relat_id = comps.relat_id
          )
AND       (
            (    data.is_numeric = 1
             AND latest_dattpoints.datatype IN ('I','R')
             AND (   
                    (   latest_datpoints.operator =  'LT'
                    AND TO_NUMBER(data.data)      <  TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'LTE'
                    AND TO_NUMBER(data.data)      <= TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'GTE'
                    AND TO_NUMBER(data.data)      >= TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'GT'
                    AND TO_NUMBER(data.data)      >  TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'EQ'
                    AND TO_NUMBER(data.data)      =  TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'NEQ'
                    AND TO_NUMBER(data.data)      != TO_NUMBER(latest_datpoints.value)
                    )                        
                 )
            )
         OR (    data.is_numeric = 0
            AND  latest_dattpoints.datatype IN ('O')
            AND  (   
                    (   latest_datpoints.operator = 'EQ'
                    AND data.data                 = latest_datpoints.value
                    )
                 OR (   latest_datpoints.operator = 'NEQ'
                    AND data.data                 != latest_datpoints.value
                    )
                 )
            )
         )
Execution Plan
Plan hash value: 3808651332

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name               | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                    |      1 |        |    102K|00:06:14.00 |    1684K|  76215 |  42952 |       |       |          |         |
|   1 |  NESTED LOOPS OUTER                |                    |      1 |      1 |    102K|00:06:14.00 |    1684K|  76215 |  42952 |       |       |          |         |
|   2 |   VIEW                             |                    |      1 |      1 |    102K|00:06:11.36 |    1479K|  76215 |  42952 |       |       |          |         |
|   3 |    NESTED LOOPS OUTER              |                    |      1 |      1 |    102K|00:06:11.06 |    1479K|  76215 |  42952 |       |       |          |         |
|   4 |     NESTED LOOPS OUTER             |                    |      1 |      1 |    102K|00:06:08.64 |    1466K|  76215 |  42952 |       |       |          |         |
|   5 |      VIEW                          |                    |      1 |      1 |    100K|00:05:58.88 |     124K|  76215 |  42952 |       |       |          |         |
|*  6 |       FILTER                       |                    |      1 |        |    100K|00:05:58.57 |     124K|  76215 |  42952 |       |       |          |         |
|*  7 |        HASH JOIN OUTER             |                    |      1 |   4362 |    795K|00:05:55.38 |   44269 |  76215 |  42952 |   118M|  7538K| 3508K (1)|     113K|
|*  8 |         HASH JOIN RIGHT OUTER      |                    |      1 |   4362 |    795K|00:05:49.48 |   40336 |  62168 |  28905 |   888K|   888K| 1337K (0)|         |
|   9 |          TABLE ACCESS FULL         | USR_CRD            |      1 |   7419 |   7419 |00:00:00.01 |      33 |      0 |      0 |       |       |          |         |
|* 10 |          HASH JOIN                 |                    |      1 |   4362 |    795K|00:05:47.45 |   40303 |  62168 |  28905 |   848K|   848K| 1354K (0)|         |
|  11 |           TABLE ACCESS FULL        | CRD_REVISION       |      1 |   4385 |   4385 |00:00:00.01 |      65 |      0 |      0 |       |       |          |         |
|* 12 |           HASH JOIN                |                    |      1 |   4362 |    795K|00:05:45.44 |   40238 |  62168 |  28905 |   921K|   921K| 1225K (0)|         |
|  13 |            TABLE ACCESS FULL       | UNIT               |      1 |    168 |    168 |00:00:00.01 |       3 |      0 |      0 |       |       |          |         |
|* 14 |            HASH JOIN               |                    |      1 |   4362 |    795K|00:05:43.52 |   40235 |  62168 |  28905 |   919K|   919K| 1353K (0)|         |
|  15 |             TABLE ACCESS FULL      | RELAT              |      1 |   7860 |   7860 |00:00:00.01 |      65 |      0 |      0 |       |       |          |         |
|* 16 |             HASH JOIN              |                    |      1 |   4411 |    808K|00:05:43.96 |   40170 |  62168 |  28905 |    88M|  7674K| 3033K (1)|   87040 |
|* 17 |              HASH JOIN             |                    |      1 |   4411 |    808K|00:05:42.75 |   39511 |  52028 |  18765 |    77M|  7759K| 3036K (1)|   75776 |
|* 18 |               HASH JOIN            |                    |      1 |   4411 |    808K|00:08:32.44 |   36173 |  43253 |   9990 |  9216K|  1980K| 8459K (1)|   86016 |
|* 19 |                HASH JOIN           |                    |      1 |  88534 |    109K|00:00:00.41 |    2905 |      0 |      0 |  2175K|  1055K| 2952K (0)|         |
|  20 |                 TABLE ACCESS FULL  | DATPOINT_REVISION  |      1 |  34319 |  34319 |00:00:00.02 |     324 |      0 |      0 |       |       |          |         |
|* 21 |                 HASH JOIN          |                    |      1 |  81154 |  96329 |00:00:00.26 |    2581 |      0 |      0 |  1610K|  1171K| 1978K (0)|         |
|  22 |                  TABLE ACCESS FULL | DATPOINT           |      1 |  31416 |  31416 |00:00:00.02 |     144 |      0 |      0 |       |       |          |         |
|  23 |                  TABLE ACCESS FULL | DATTPOINT_REVISION |      1 |    235K|    235K|00:00:00.14 |    2437 |      0 |      0 |       |       |          |         |
|  24 |                TABLE ACCESS FULL   | DATA               |      1 |   5917K|   5917K|00:00:03.26 |   33268 |  33263 |      0 |       |       |          |         |
|  25 |               TABLE ACCESS FULL    | DATAREVIEW         |      1 |    315K|    315K|00:00:00.17 |    3338 |      0 |      0 |       |       |          |         |
|  26 |              TABLE ACCESS FULL     | COMPS              |      1 |  66506 |  66506 |00:00:00.04 |     659 |      0 |      0 |       |       |          |         |
|  27 |         VIEW                       |                    |      1 |  26097 |  10622 |00:00:00.46 |    3933 |     35 |     35 |       |       |          |         |
|  28 |          SORT GROUP BY             |                    |      1 |  26097 |  10622 |00:00:00.45 |    3933 |     35 |     35 |   974K|   710K|  865K (0)|         |
|* 29 |           HASH JOIN                |                    |      1 |  26097 |  26097 |00:00:00.38 |    3933 |     35 |     35 |  1543K|  1122K| 1720K (0)|    1024 |
|  30 |            TABLE ACCESS FULL       | OREP               |      1 |  26097 |  26097 |00:00:00.06 |     596 |      0 |      0 |       |       |          |         |
|  31 |            TABLE ACCESS FULL       | DATAREVIEW         |      1 |    315K|    315K|00:00:00.23 |    3337 |      0 |      0 |       |       |          |         |
|  32 |        SORT AGGREGATE              |                    |  74449 |      1 |  74449 |00:00:01.92 |   75528 |      0 |      0 |       |       |          |         |
|  33 |         FIRST ROW                  |                    |  74449 |      1 |  74449 |00:00:01.43 |   75528 |      0 |      0 |       |       |          |         |
|* 34 |          INDEX RANGE SCAN (MIN/MAX)| SYS_C008600        |  74449 |      1 |  74449 |00:00:01.21 |   75528 |      0 |      0 |       |       |          |         |
|  35 |        SORT AGGREGATE              |                    |   2400 |      1 |   2400 |00:00:00.08 |    4259 |      0 |      0 |       |       |          |         |
|  36 |         FIRST ROW                  |                    |   2400 |      1 |   2400 |00:00:00.07 |    4259 |      0 |      0 |       |       |          |         |
|* 37 |          INDEX RANGE SCAN (MIN/MAX)| SYS_C008513        |   2400 |      1 |   2400 |00:00:00.06 |    4259 |      0 |      0 |       |       |          |         |
|* 38 |      TABLE ACCESS BY INDEX ROWID   | PERIO              |    100K|      1 |  85666 |00:00:09.27 |    1342K|      0 |      0 |       |       |          |         |
|* 39 |       INDEX RANGE SCAN             | SYS_C008501        |    100K|      1 |   1398K|00:00:03.00 |     107K|      0 |      0 |       |       |          |         |
|  40 |     VIEW                           |                    |    102K|      1 |  99199 |00:00:01.12 |   13431 |      0 |      0 |       |       |          |         |
|* 41 |      INDEX FULL SCAN               | SYS_C008483        |    102K|      1 |  99199 |00:00:00.75 |   13431 |      0 |      0 |       |       |          |         |
|  42 |   TABLE ACCESS BY INDEX ROWID      | CRD_DIVISION       |    102K|      1 |  98748 |00:00:02.14 |     205K|      0 |      0 |       |       |          |         |
|* 43 |    INDEX UNIQUE SCAN               | SYS_C008507        |    102K|      1 |  98748 |00:00:01.17 |     106K|      0 |      0 |       |       |          |         |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (RELATified by operation id):
---------------------------------------------------

   6 - filter(("SERE"."EFFECTIVE_DATE"= AND "MRRE"."EFFECTIVE_DATE"=))
   7 - access("LATEST_OREPS_PER_COMPS"."COMPS_ID"="COMPS"."COMPS_ID")
   8 - access("RELAT"."UNIT_CLASS_ID"="USR_CRD"."UNIT_CLASS_ID" AND "RELAT"."CRD_ID"="USR_CRD"."CRD_ID")
  10 - access("MRRE"."CRD_REVISION_ID"="TERE"."CRD_REVISION_ID")
  12 - access("COMPS"."UNIT_ID"="UNIT"."UNIT_ID")
  14 - access("COMPS"."RELAT_ID"="RELAT"."RELAT_ID")
       filter(("SETP"."UNIT_CLASS_ID" IS NULL OR ("SETP"."UNIT_CLASS_ID"="RELAT"."UNIT_CLASS_ID" AND "SETP"."UNIT_CLASS_ID" IS NOT NULL)))
  16 - access("DATAREVIEW"."COMPS_ID"="COMPS"."COMPS_ID")
       filter(("SETP"."RELAT_ID" IS NULL OR ("SETP"."RELAT_ID"="COMPS"."RELAT_ID" AND "SETP"."RELAT_ID" IS NOT NULL)))
  17 - access("DATAREVIEW"."DATAREVIEW_ID"="DATA"."DATAREVIEW_ID")
  18 - access("TERE"."DATTPOINT_ID"="DATA"."DATTPOINT_ID")
       filter(((CASE  WHEN  REGEXP_LIKE ("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A70890300000000027289030000000000000000800
              10000EC90CB2B10000200000000000000000081000000') ) THEN 1 ELSE 0 END =1 AND INTERNAL_FUNCTION("TERE"."DATATYPE") AND (("SERE"."OPERATOR"='LT' AND
              TO_NUMBER("DATA"."DATA")<TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='LTE' AND TO_NUMBER("DATA"."DATA")<=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
              ("SERE"."OPERATOR"='GTE' AND TO_NUMBER("DATA"."DATA")>=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='GT' AND
              TO_NUMBER("DATA"."DATA")>TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='EQ' AND TO_NUMBER("DATA"."DATA")=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
              ("SERE"."OPERATOR"='NEQ' AND TO_NUMBER("DATA"."DATA")<>TO_NUMBER(TO_CHAR("SERE"."VALUE"))))) OR (CASE  WHEN  REGEXP_LIKE
              ("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A7089030000000002728903000000000000000080010000EC90CB2B100002000000000000000000810
              00000') ) THEN 1 ELSE 0 END =0 AND "TERE"."DATATYPE"='O' AND (("SERE"."OPERATOR"='EQ' AND "SERE"."VALUE"=TO_NUMBER("DATA"."DATA")) OR ("SERE"."OPERATOR"='NEQ'
              AND "SERE"."VALUE"<>TO_NUMBER("DATA"."DATA"))))))
  19 - access("SERE"."DATPOINT_ID"="SETP"."DATPOINT_ID")
  21 - access("SETP"."DATTPOINT_ID"="TERE"."DATTPOINT_ID")
  29 - access("OREP"."DATAREVIEW_ID"="INSP"."DATAREVIEW_ID")
  34 - access("SERE2"."DATPOINT_ID"=:B1)
  37 - access("MRRE2"."CRD_ID"=:B1)
  38 - filter("DATAREVIEW"."COMPLETEDATE"<="PERIO"."ENDDATE")
  39 - access("COMPS"."UNIT_ID"="PERIO"."UNIT_ID" AND "DATAREVIEW"."COMPLETEDATE">="PERIO"."STARTDATE")
  41 - filter(("PERIO"."LOC_ID"="LOC"."LOC_ID" OR ("PERIO"."PERIO_ID" IS NULL AND "from$_subquery$_030"."LOCAHOME"="LOC"."LOC_ID")))
  43 - access("RELAT"."CRD_ID"="CRD_DIVISION"."CRD_ID" AND "LOC"."LOC_ID"="CRD_DIVISION"."LOC_ID")
My Thoughts / Questions

So following the usual process I retrieve the execution plan above with the GATHER_PLAN_STATISTICS hint enabled and began looking through to find areas where the A-Rows severely deviated from the E-Rows. It looks like the problem begins on ID = 18
|* 18 |               HASH JOIN            |                    |      1 |   4411 |    808K|00:08:32.44 |   36173 |  43253 |   9990 |  9216K|  1980K| 8459K (1)|   86016 |

  18 - access("TERE"."DATTPOINT_ID"="DATA"."DATTPOINT_ID")
       filter(((CASE  WHEN  REGEXP_LIKE ("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A70890300000000027289030000000000000000800
              10000EC90CB2B10000200000000000000000081000000') ) THEN 1 ELSE 0 END =1 AND INTERNAL_FUNCTION("TERE"."DATATYPE") AND (("SERE"."OPERATOR"='LT' AND
              TO_NUMBER("DATA"."DATA")<TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='LTE' AND TO_NUMBER("DATA"."DATA")<=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
              ("SERE"."OPERATOR"='GTE' AND TO_NUMBER("DATA"."DATA")>=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='GT' AND
              TO_NUMBER("DATA"."DATA")>TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR ("SERE"."OPERATOR"='EQ' AND TO_NUMBER("DATA"."DATA")=TO_NUMBER(TO_CHAR("SERE"."VALUE"))) OR
              ("SERE"."OPERATOR"='NEQ' AND TO_NUMBER("DATA"."DATA")<>TO_NUMBER(TO_CHAR("SERE"."VALUE"))))) OR (CASE  WHEN  REGEXP_LIKE
              ("DATA",'^\s*[-+]?(\d+\.?\d*|\d*\.?\d+)(e\+\d+|E\d+)?%?\s*$',HEXTORAW('608FCB2B3A7089030000000002728903000000000000000080010000EC90CB2B100002000000000000000000810
              00000') ) THEN 1 ELSE 0 END =0 AND "TERE"."DATATYPE"='O' AND (("SERE"."OPERATOR"='EQ' AND "SERE"."VALUE"=TO_NUMBER("DATA"."DATA")) OR ("SERE"."OPERATOR"='NEQ'
              AND "SERE"."VALUE"<>TO_NUMBER("DATA"."DATA"))))))
As you can see there is a drastic difference in the E-Rows (4411) versus the A-rows (808K).

This corresponds directly to the last predicate in the main query:
AND       (
            (    data.is_numeric = 1
             AND latest_dattpoints.datatype IN ('I','R')
             AND (   
                    (   latest_datpoints.operator =  'LT'
                    AND TO_NUMBER(data.data)      <  TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'LTE'
                    AND TO_NUMBER(data.data)      <= TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'GTE'
                    AND TO_NUMBER(data.data)      >= TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'GT'
                    AND TO_NUMBER(data.data)      >  TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'EQ'
                    AND TO_NUMBER(data.data)      =  TO_NUMBER(latest_datpoints.value)
                    )
                 OR (   latest_datpoints.operator =  'NEQ'
                    AND TO_NUMBER(data.data)      != TO_NUMBER(latest_datpoints.value)
                    )                        
                 )
            )
         OR (    data.is_numeric = 0
            AND  latest_dattpoints.datatype IN ('O')
            AND  (   
                    (   latest_datpoints.operator = 'EQ'
                    AND data.data                 = latest_datpoints.value
                    )
                 OR (   latest_datpoints.operator = 'NEQ'
                    AND data.data                 != latest_datpoints.value
                    )
                 )
            )
         )
Basically the code is doing a comparison depending on the operator. It even handles if the data is numeric or not. Unfortunately the data.data column is a VARCHAR2 column which stores various different types of data from strings to numeric values.

Given that the data model can't really be changed significantly I can't see how I can provide ANY additional information to the optimizer to correct this cardinality error. That's why I'm here. Any suggestions?
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Mar 25 2011
Added on Feb 24 2011
36 comments
448 views