DB verion : 12.1
I have a Siebel DB in my shop.
A new functionality was added in the code and there is a new SQL for it.
In dev environment, this SQL was performing badly. In this SQL query, there is a table which is 50 GB in size.
To fix the performance issue, I gathered stats with 100 % sampling as shown below.
exec dbms_stats.gather_table_stats
(
ownname => 'SIEBEL',
tabname => 'LARGE_TABLE',
degree => 2,
cascade => true,
no_invalidate => false,
estimate_percent => 100
);
Now, the optimizer is picking a good plan in dev environment.
But, Siebel guys use a custom stored proc to gather stats which uses the below logic to gather stats.
Logic of determining sampling percentage based on table row count in Siebel custom stored proc which will gather stats
IF num_rows > 100 THEN
IF i.num_rows IS NULL OR i.last_analyzed IS NULL THEN -- no stats, then count rows (sample 1%)
EXECUTE IMMEDIATE 'SELECT COUNT(*) * 100 FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE(1)' INTO num_rows;
ELSE -- use num rows from prior stats gathering
num_rows := i.num_rows;
END IF;
END IF;
-- determine sample size and frequency as per current table size
IF num_rows BETWEEN 0 AND 1e6 THEN estimate_percent := '100'; frequency := SYSDATE - 21;
ELSIF num_rows BETWEEN 1e6 AND 1e7 THEN estimate_percent := '30'; frequency := SYSDATE - 28;
ELSIF num_rows BETWEEN 1e7 AND 1e8 THEN estimate_percent := '10'; frequency := SYSDATE - 35;
ELSIF num_rows BETWEEN 1e8 AND 1e9 THEN estimate_percent := '3'; frequency := SYSDATE - 42;
ELSE estimate_percent := '1'; frequency := SYSDATE - 49;
END IF;
-- if 11g then use auto sample size instead (ok only if gathering with no histograms)
IF l_rdbms_release >= 11 THEN
estimate_percent := 'DBMS_STATS.AUTO_SAMPLE_SIZE';
END IF;
This table has 7500000 rows (7.5 million rows). So, I think, the sampling will be 30% for this table if the Siebel custom stored proc is used to gather stats
Siebel application support guys are scared to run the gather stats with 100% sampling in production. They prefer their custom stored proc with the below logic to gather stats.
I want to know if there is any harm in gathering stats for a table (+ its indexes) with 100% sampling.