Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

CPU consumption 100%: How do I see which SQL is consuming?

664517Feb 25 2009 — edited Feb 25 2009
Hi,

I need help badly with this because I cannot get my head around the problem. I have a Windows server 2003 Enterprise 64-bit running Oracle 10.2.0.3 and last night it started consuming 100% CPU. The Windows performance monitor shows 100% CPU in any case, and it was difficult to even login via remote desktop.

It is Oracle.exe that is consuming, and there is only one database on this server. It is an Opteron 875 2.2Ghz with 8GB ram.

I want to see which SQL statement is responsible for the CPU consumption. I know exactly which user/programapplication is using it... it is useless to kick the session because the application connects immediately again from the app. server and runs the same thing. Restarting the database also has the same effect: session reconnects and CPU goes through the roof. Customer says that they are running the same stuff as always, and app. has not been updated recently. 100 % CPU to them, means something is wrong.

Incidently, I tried to CPU patch this database a couple of days beforehand, but there was not enough space on disk (needed 1.6GB) so opatch complained and would not continue. I simply restarted the database without having run opatch. I cannot imagine that this is the cause of this problem.

Anyhoo...

I have a script from Kyle Haileys web site to get the CPU for each sql_id using ash tables:

-- (c) Kyle Hailey 2007

col type for a10
select * from (
select
ash.SQL_ID , ash.SQL_PLAN_HASH_VALUE Plan_hash, aud.name type,
sum(decode(ash.session_state,'ON CPU',1,0)) "CPU",
sum(decode(ash.session_state,'WAITING',1,0)) -
sum(decode(ash.session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "WAIT" ,
sum(decode(ash.session_state,'WAITING', decode(wait_class, 'User I/O',1,0),0)) "IO" ,
sum(decode(ash.session_state,'ON CPU',1,1)) "TOTAL"
from v$active_session_history ash,
audit_actions aud
where SQL_ID is not NULL
and ash.sql_opcode=aud.action
and ash.sample_time > sysdate - &minutes /( 60*24)
group by sql_id, SQL_PLAN_HASH_VALUE , aud.name
order by sum(decode(session_state,'ON CPU',1,1)) desc
) where rownum < 10
/

This give following output:

SQL_ID PLAN_HASH TYPE CPU WAIT IO TOTAL
------------- ---------- ---------- ---------- ---------- ---------- ----------
cxk376gut06wn 3831248992 SELECT 12 145 5855 6012
4ffvdt72n46yt 4094592372 SELECT 0 3 1131 1134
dnxgm7bg8wqg1 4025412530 SELECT 3 13 337 353
3y0dyvkc9sx69 1764840833 SELECT 18 169 56 243
abz05715h08ct 1967470119 SELECT 0 7 175 182
4gd6b1r53yt88 0 UNKNOWN 0 92 4 96
9phhuzxpqwsgc 2662599208 SELECT 0 67 0 67
275bbj5czbrwp 629013258 SELECT 0 2 64 66
1dvrrg3aah2mu 75914095 SELECT 0 6 57 63

My assumption is that this gives the average CPU usage over the last x minutes and the total cpu usage for this sql_id. There are no sql running that show a very high average CPU, but the total is up in 6000 for the longest running sql, so I get the statement for the top sql_id that has highest total, and it is this:

SQL_ID cxk376gut06wn, child number 2
-------------------------------------
SELECT NVL(MIN(ID),0) FROM TABLE_OBFUSCATED WHERE TASK = :B1 AND
STATUS='JobQueue' AND MESSAGE_ID IS NULL

Plan hash value: 3831248992

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3231 (100)| |
| 1 | SORT AGGREGATE | | 1 | 42 | | |
|* 2 | TABLE ACCESS FULL| TABLE_OBFUSCATED | 3 | 126 | 3231 (1)| 00:00:39 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(("TASK"=:B1 AND "STATUS"='JobQueue' AND "MESSAGE_ID" IS NULL))


It doesn't seem to fit! It's such a simple statement, but according to ash this is the sql using the most total cpu over the last 30 minutes?? So I create a temp index on the filter and check the plan for the statement again, and it uses the index, but I need to kill the session or bounce the database for the new plan to be used because the currently running session is still using the old plan. Customer was desperate to get it down so I even tried flushing the shared pool but new plan does not get used. I did not want to restart the database. Do I have to kill all sessions using that plan for the new one to take effect?

Now im stuck. I don't really know how else to get an accurate picture of which sql is currently consuming CPU. I need to know this to try to fix the SQL because stopping the application is not an option, even if it is throwing too much at the database. The only other option I can think of is setting up profile limits for cpu consumption in oracle, but then the job will likely take an unacceptable amount of time for the customer.

So my questions are:

1. How else can I check which SQL is using CPU accurately?

2. Does the SQL I show above accuratlely display current average CPU over x minutes with ASH? Why is it so low for the top consumer but my CPU is still at 100% on the OS?

Thankyou for your help.
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Mar 25 2009
Added on Feb 25 2009
5 comments
17,357 views