Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Wrong Cardinality Estimation

User_OCZ1TNov 24 2018 — edited Dec 16 2018

Hi We are using version 11.2.0.4 of oracle Exadata. We have a query which is suffering due to bad cardinality estimation and after digging into that i found one of the table which is partitioned is having issue in cardinailty estimation with a filter on one column(INDICATOR). Below is the query and its plan. and the statistics.  Apology for the bad formatting of the table/column statistics, somehow i am not able to make the alignment proper in this editor.

The global column stats shows there exists no histogram on the column and num_distinct as "2" as it was an indicator(Y/N), so i was expecting the cardinality estimation to be num_rows/num_distinct=1335,425,399/2=667,712,700 , but the plan shows estimation as 1. I did notice the density column is having some very low value (5.56393360313218E-10) stored in it , ideally it should be .5 considering two distinct values and no histogram, so want to understand why its happening and if this density column is the cause of this bad cardinality estimation? And why this density is coming so bad?

Again when i put the additional filter on partition key as then the estimation  shows as ~76million which is correct one. it seems to be coming from DBA_PART_HISTOGRAMS. still not able to decode the value stored in ENDPOINT_VALUE to validate?

This table is range partitione on column SDT and column "INDICATOR" is only holding values 'Y' and 'N'. And if i just query the specific partition, i see below results.

SQL> select INDICATOR,count(*) from RSTDL partition(RSTDL_20181122_P) group by INDICATOR;

D   COUNT(*)

- ----------

Y   76377353

N  126695652

Global Table stats:-

TABLE_NAME    NUM_ROWS    BLOCKS        AVG_ROW_LEN    SAMPLE_SIZE    LAST_ANALYZED           GLOBAL_STATS

RSTDL              1335425399    49673661       268                       4916                   11/24/2018 12:33:29 PM    YES

All individual Partition stats :-

TABLE_NAME    PARTITION_NAME    NUM_ROWS    BLOCKS    AVG_ROW_LEN    SAMPLE_SIZE    LAST_ANALYZED    GLOBAL_STATS

RSTDL    RSTDL_20181122_P    203073005    7543488    256            203073005    11/23/2018 12:49:11 AM    YES

RSTDL    RSTDL_20170729_P    154399686    5757723    254            154399686    7/30/2017 5:37:14 AM    YES

RSTDL    RSTDL_20181119_P    195229089    7269843    255            195229089    11/20/2018 12:38:35 AM    YES

RSTDL    RSTDL_20181120_P    212601907    7905866    257            212601907    11/21/2018 12:38:47 AM    YES

RSTDL    RSTDL_20181121_P    211516854    7861260    257            211516854    11/22/2018 12:38:43 AM    YES

RSTDL    RSTDL_20181123_P    217522243    8082610    257            217522243    11/24/2018 12:39:42 AM    YES

RSTDL    RSTDL_20181124_P    155855612    5800301    256            155855612    11/24/2018 2:03:14 PM    YES

Global column stats:-

TABLE_NAME    COLUMN_NAME    NUM_DISTINCT    LOW_VALUE    HIGH_VALUE    DENSITY                    NUM_NULLS    NUM_BUCKETS    LAST_ANALYZED            SAMPLE_SIZE    GLOBAL_STATS    AVG_COL_LEN    HISTOGRAM

RSTDL        INDICATOR    2                4E            59            5.56393360313218E-10    0            1            11/24/2018 12:51:17 PM    1339276439    YES                2            NONE

Column stats specific to the one partition in which filter applied:-

TABLE_NAME    PARTITION_NAME        COLUMN_NAME    NUM_DISTINCT    LOW_VALUE    HIGH_VALUE    DENSITY                NUM_NULLS    NUM_BUCKETS    LAST_ANALYZED            SAMPLE_SIZE    GLOBAL_STATS    AVG_COL_LEN    HISTOGRAM

RSTDL        RSTDL_20181122_P    INDICATOR    2                4E            59            2.45109735048318E-9    0            2            11/23/2018 12:46:52 AM    5528        YES                 2            FREQUENCY

Result from DBA_PART_HISTOGRAMS for the  specific column and  partition:-

OWNER    TABLE_NAME    PARTITION_NAME    COLUMN_NAME    BUCKET_NUMBER    ENDPOINT_VALUE    ENDPOINT_ACTUAL_VALUE

user1        RSTDL             RSTDL_20181122_P    INDICATOR        3442                           4.04999154965717E35   

user1        RSTDL             RSTDL_20181122_P    INDICATOR        5528                           4.621144204096E35   

Result from DBA_tab_HISTOGRAMS for that specific column:-

OWNER    TABLE_NAME    COLUMN_NAME    ENDPOINT_NUMBER    ENDPOINT_VALUE    ENDPOINT_ACTUAL_VALUE

user1        RSTDL               INDICATOR            0                                   4.04999154965717E35   

user1        RSTDL               INDICATOR            1                                   4.621144204096E35   

SQL> select * from RSTDL

  2  where  RSTDL.INDICATOR = 'Y';

Elapsed: 00:00:00.05

Execution Plan

----------------------------------------------------------

Plan hash value: 1587933015

-----------------------------------------------------------------------------------------------------------------

| Id  | Operation                  | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

-----------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT           |                    |     1 |   268 |    14M  (1)| 43:12:14 |    |  |

|   1 |  PARTITION RANGE ALL       |                    |     1 |   268 |    14M  (1)| 43:12:14 |     1 |    11 |

|*  2 |   TABLE ACCESS STORAGE FULL| RSTDL              |     1 |   268 |    14M  (1)| 43:12:14 |     1 |    11 |

-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - storage("RSTDL"."INDICATOR"='Y')

       filter("RSTDL"."INDICATOR"='Y')

      

      

SQL> select * from RSTDL

  2  where  RSTDL.INDICATOR = 'Y'

  3  and SDT='22-nov-2018';

Elapsed: 00:00:00.08

Execution Plan

----------------------------------------------------------

Plan hash value: 147367774

-----------------------------------------------------------------------------------------------------------------

| Id  | Operation                  | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

-----------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT           |                    |    76M|    18G|  2153K  (1)| 06:34:54 |    |  |

|   1 |  PARTITION RANGE SINGLE    |                    |    76M|    18G|  2153K  (1)| 06:34:54 |     5 |     5 |

|*  2 |   TABLE ACCESS STORAGE FULL|              RSTDL |    76M|    18G|  2153K  (1)| 06:34:54 |     5 |     5 |

-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - storage("RSTDL"."INDICATOR"='Y' AND "SDT"=TO_DATE(' 2018-11-22 00:00:00',

              'syyyy-mm-dd hh24:mi:ss'))

       filter("RSTDL"."INDICATOR"='Y' AND "SDT"=TO_DATE(' 2018-11-22 00:00:00',

              'syyyy-mm-dd hh24:mi:ss'))

This post has been answered by Jonathan Lewis on Nov 25 2018
Jump to Answer
Comments
Post Details
Added on Nov 24 2018
30 comments
1,432 views