Thread: Statistics


Permlink Replies: 10 - Pages: 1 - Last Post: Jan 4, 2008 5:01 PM Last Post By: Richard Foote
mark100000

Posts: 71
Registered: 03/18/04
Statistics
Posted: Jan 3, 2008 6:25 AM
Click to report abuse...   Click to reply to this thread Reply
Hi All, i would like to know if it is possible to generate statistics via dbms_stats without generating any histograms. I know Oracle probably knows better than i do, but it is generating a hell of a lot that i do not think are required? Also what is your advice on how often to refresh statistics?

Finally i have another issue where a user is generating reports using a date range, once the range entered exceeds the what is recorded in the statistics the explain plan is ignored and the report takes forever. Is there a possible work around to try to avoid this issue, i have heard dynamic sampling metioned - is this something i should consider?

Many thanks in advance for any help.

Mark.
Santosh Kumar

Posts: 1,042
Registered: 06/01/06
Re: Statistics
Posted: Jan 3, 2008 6:46 AM   in response to: mark100000 in response to: mark100000
Click to report abuse...   Click to reply to this thread Reply
What is the syntax that you use for dbms_stats?
If you don't put 'method_opt' parameter,I don't think dbms_stats will generate Histogram.

If there is a 10% of increment in the size of table.You should analyze the table.
Jonathan Lewis

Posts: 1,794
Registered: 01/23/07
Re: Statistics
Posted: Jan 3, 2008 6:47 AM   in response to: mark100000 in response to: mark100000
Click to report abuse...   Click to reply to this thread Reply
Hi All, i would like to know if it is possible to
generate statistics via dbms_stats without generating
any histograms. I know Oracle probably knows better
than i do, but it is generating a hell of a lot that
i do not think are required?

I suspect you're using 10g. As a starting point to your question, the following item from my blog may be helpful.
http://jonathanlewis.wordpress.com/2007/02/02/10g-upgrade/

Assuming you're using a very simple (minimum parameters) calls to dbms_stats, then you can probably get away with adding the following to your call:
method_opt => 'for all columns size 1' 

It's a fairly brutal change, though and doesn't produce any histograms at all. So some of your plans are likely to change for the worse because they need histogram help. You ought to test the impact of the change on a copy of the database before implementing it on production.

Finally i have another issue where a user is
generating reports using a date range, once the range
entered exceeds the what is recorded in the
statistics the explain plan is ignored and the report
takes forever.

This is a fairly common threat with sequence of time-based columns. The highest value stored keeps increasing, so the "high_value" statistics goes out of date, and the optimizer thinks you are asking for data that can't possibly exist and does something silly.

This question ties in with your question about 'how often'. There is a short note about "how often" on my website at: http://www.jlcomp.demon.co.uk/stats_i.html in passing it mentions the use of dbms_stats.set_column_stats() as a way of setting corrected, or improved, statistics for special cases without going to the expense of running a gather.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk
stuart_wong

Posts: 106
Registered: 03/19/07
Re: Statistics
Posted: Jan 3, 2008 7:17 AM   in response to: mark100000 in response to: mark100000
Click to report abuse...   Click to reply to this thread Reply
What is your Oracle version? Do not assume Oracle is generating histogram statistics unnecessarily (though in 10g this seems to be the case on occasion), you will need to run some tests to prove or disapprove that this is the case, otherwise you might do more harm that good.

You can run some queries using sql trace and tkprof (or other tool) and review the plan and statistics. See what Oracle is doing, and then you can try and hint to what you think it should be doing, and then compare the run efficiencies. Of course this can be a tedious process since there are other plans which might be more optimal, but there are tools which will help you here by generating multiple plans with the corresponding run time statistics (Toad comes to mind). You can also use method_opt to change the histograms being gathered.

How often you collect statistics is based on your environment really. Usually you set monitoring to on and the method_opt to "auto", but this also depends on your environment since less than 10% change in data be may require re-analysis for an environment.

Not sure about the "explain plan being ignored once a range entered exceeds what is recorded in the statistics", there is always a plan being used, unless you are referring to the "usual plan" being ignored and another being used. Quite likely the optimizer has determined that a FTS is more efficient than an index, might be true or not, depends on your statistics. I'd advice to re-analyze and rerun the query with sql trace enabled to review what's happening.

Dynamic sampling is set to 1 in 9i and 2 in 10g, it's used when a table does not have statistics and another does in a query which is executed (sort of guessing the stats on the unanalyzed table). The range goes up to 12 and the effort of the optimizer increases with the level though 2 is usually sufficient.

You can review this link for more information:

http://www.dba-oracle.com/art_orafaq_cbo_stats.htm
Santosh Kumar

Posts: 1,042
Registered: 06/01/06
Re: Statistics
Posted: Jan 3, 2008 7:30 AM   in response to: Jonathan Lewis in response to: Jonathan Lewis
Click to report abuse...   Click to reply to this thread Reply
Very valuable reply, Jonathan.

We have just migrated one of our production database from 9iR2 32 bit(Windows) to 10gR2 64bit (Linux). We are getting a lot of performance issues. Some processes have started taking huge amount of time,on the other hand some processess are behaving better than before.Usually, we use following syntax for analysis:

exec dbms_stats.gather_table_stats('owner','table_name',estimate_percent=>dbms_stats.auto_sample_size,cascade=>true);

We don't use 'method_opt' parameter. The syntax was used to analyze the table in 9i. What modification should I made to get similar execution plan that I was getting in 9i with same syntax?
Jonathan Lewis

Posts: 1,794
Registered: 01/23/07
Re: Statistics
Posted: Jan 3, 2008 7:47 AM   in response to: Santosh Kumar in response to: Santosh Kumar
Click to report abuse...   Click to reply to this thread Reply
We have just migrated one of our production database
from 9iR2 32 bit(Windows) to 10gR2 64bit (Linux).

exec
dbms_stats.gather_table_stats('owner','table_name',est
imate_percent=>dbms_stats.auto_sample_size,cascade=>tr
ue);

We don't use 'method_opt' parameter. The syntax was
used to analyze the table in 9i. What modification
should I made to get similar execution plan that I
was getting in 9i with same syntax?

As it says in the blog -
So, before you start trouble-shooting specific SQL statements after the upgrade, go back to your stats collection scripts and modify the code to set any default values to the actual value they would have had in 9i. Then re-run the collection and see if most of your problems have gone away.

As a reference point, the blog shows you the defaults for 9i and 10g, so you can see which parameters need changing.

The trouble is, there are lots of other things that change between 9i and 10g - and it's not easy to decide on the optimum upgrade strategy. Once you've gone into production, and change like this is going to be a bit tense, because it's almost inevitable that some stuff will go faster, some will go slower.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk

mark100000

Posts: 71
Registered: 03/18/04
Re: Statistics
Posted: Jan 4, 2008 2:55 AM   in response to: mark100000 in response to: mark100000
Click to report abuse...   Click to reply to this thread Reply
Hi Jonathan, thankyou for your reply. I have had a look at the blog and the area i am confused with is the creation of histograms. Whichever method i choose to generate the statistics I produce more rows in the dba_tab_histogram table. I noticed on the blog that you have a query to identify 'real histograms' which when i run returns no rows.

I wonder can you clarify what do the rows in the dba_tab_histograms table represent? If they are not 'real histogram' then what are they and does the creation of these have any implications on db performance?

Many thanks in advance.
Mark.

BTW - version is 9i
Jonathan Lewis

Posts: 1,794
Registered: 01/23/07
Re: Statistics
Posted: Jan 4, 2008 3:03 AM   in response to: mark100000 in response to: mark100000
Click to report abuse...   Click to reply to this thread Reply
Barring a few oddities which result in one row per column in dba_tab_histograms, you should expect to find two rows per column because Oracle uses the table to store the low value and high value for the column.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk
Richard Foote

Posts: 482
Registered: 12/13/99
Re: Statistics
Posted: Jan 4, 2008 3:36 AM   in response to: mark100000 in response to: mark100000
Click to report abuse...   Click to reply to this thread Reply
Hi Mark

IMHO, the single biggest and most significant change when moving to 10g is the difference in the default behavior of method_opt in dbms_stats.

Not only does it change from 'FOR ALL COLUMNS SIZE 1' (no histograms) to 'FOR ALL COLUMNS SIZE AUTO' (histograms for those tables that Oracle deems necessary based on data distribution and whether sql statements reference the columns), but it also generates a job by default to collect these statistics for you.

It all sounds like the ideal scenario, just let Oracle worry about it for you, except for the slight disadvantage that Oracle is not particularly "good" at determining which columns really need histograms and will likely generate many many many histograms unnecessarily while managing to still miss out on generating histograms on some of those columns that do need them.

In some environments with few tables and with low load this might not be an issue. However in environments with many tables and with high loads with many users executing many different queries and transactions, the impact of this change can be devastating.

My strong advice before moving to 10g is to ensure you collect statistics 10g in exactly the same manner in which you collected statistics previously by noting this change in behaviour.

Cheers

Richard Foote
http://richardfoote.wordpress.com/
Richard Foote

Posts: 482
Registered: 12/13/99
Re: Statistics
Posted: Jan 4, 2008 6:11 AM   in response to: mark100000 in response to: mark100000
Click to report abuse...   Click to reply to this thread Reply
I've just created a silly little demo on my blog that shows how the method_opt of 'FOR ALL COLUMNS SIZE AUTO' can produce (perhaps to some) surprising results and why I suggest caution when using it.

http://richardfoote.wordpress.com/2008/01/04/dbms_stats-method_opt-default-behaviour-changed-in-10g-be-careful/

Cheers

Richard Foote
http://richardfoote.wordpress.com/
Richard Foote

Posts: 482
Registered: 12/13/99
Re: Statistics
Posted: Jan 4, 2008 5:01 PM   in response to: Richard Foote in response to: Richard Foote
Click to report abuse...   Click to reply to this thread Reply
Just note I've updated the demo a little to highlight better how Oracle doesn't deal particularly well with outlier values when using the AUTO method_opt option.

http://richardfoote.wordpress.com/2008/01/04/dbms_stats-method_opt-default-behaviour-changed-in-10g-be-careful/

Cheers

Richard Foote
http://richardfoote.wordpress.com/
Legend
Guru Guru : 2500 - 1000000 pts
Expert Expert : 1000 - 2499 pts
Pro Pro : 500 - 999 pts
Journeyman Journeyman : 200 - 499 pts
Newbie Newbie : 0 - 199 pts
Oracle ACE Director
Oracle ACE Member
Oracle Employee ACE
Helpful Answer (5 pts)
Correct Answer (10 pts)

Point your RSS reader here for a feed of the latest messages in all forums