Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Any harm in gathering stats for a table with 100% sampling ?

james_pOct 1 2021 — edited Oct 1 2021

DB verion : 12.1

I have a Siebel DB in my shop.
A new functionality was added in the code and there is a new SQL for it.
In dev environment, this SQL was performing badly. In this SQL query, there is a table which is 50 GB in size.  

To fix the performance issue, I gathered stats with 100 % sampling as shown below.

exec dbms_stats.gather_table_stats
(
ownname => 'SIEBEL', 
tabname => 'LARGE_TABLE', 
degree => 2, 
cascade => true, 
no_invalidate => false, 
estimate_percent => 100
);

Now, the optimizer is picking a good plan in dev environment.
But, Siebel guys use a custom stored proc to gather stats which uses the below logic to gather stats.

Logic of determining sampling percentage based on table row count in Siebel custom stored proc which will gather stats

IF num_rows > 100 THEN
     IF i.num_rows IS NULL OR i.last_analyzed IS NULL THEN -- no stats, then count rows (sample 1%)
      EXECUTE IMMEDIATE 'SELECT COUNT(*) * 100 FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE(1)' INTO num_rows;
     ELSE -- use num rows from prior stats gathering
      num_rows := i.num_rows;
     END IF;
    END IF;

    -- determine sample size and frequency as per current table size
    IF  num_rows BETWEEN  0 AND 1e6 THEN estimate_percent := '100'; frequency := SYSDATE - 21;
    ELSIF num_rows BETWEEN 1e6 AND 1e7 THEN estimate_percent := '30'; frequency := SYSDATE - 28; 
    ELSIF num_rows BETWEEN 1e7 AND 1e8 THEN estimate_percent := '10'; frequency := SYSDATE - 35;
    ELSIF num_rows BETWEEN 1e8 AND 1e9 THEN estimate_percent :=  '3'; frequency := SYSDATE - 42;
    ELSE                  estimate_percent :=  '1'; frequency := SYSDATE - 49;
    END IF;

    -- if 11g then use auto sample size instead (ok only if gathering with no histograms)
    IF l_rdbms_release >= 11 THEN
     estimate_percent := 'DBMS_STATS.AUTO_SAMPLE_SIZE';
    END IF;

This table has 7500000 rows (7.5 million rows). So, I think, the sampling will be 30% for this table if the Siebel custom stored proc is used to gather stats

Siebel application support guys are scared to run the gather stats with 100% sampling in production. They prefer their custom stored proc with the below logic to gather stats.

I want to know if there is any harm in gathering stats for a table (+ its indexes) with 100% sampling.

This post has been answered by JohnWatson2 on Oct 1 2021
Jump to Answer
Comments
Post Details
Added on Oct 1 2021
3 comments
3,210 views