Thread: Need to Improve performace of query

This question is answered. Helpful answers available: 5. Correct answers available: 1.


Permlink Replies: 6 - Pages: 1 - Last Post: Nov 18, 2009 9:07 PM Last Post By: user552703
user552703

Posts: 9
Registered: 01/11/07
Need to Improve performace of query
Posted: Nov 18, 2009 5:48 AM
 
Click to report abuse...   Click to reply to this thread Reply
Hi,
I have 2 tables TAB1 and TAB2. TAB1 has around 55,000,000 rows and TAB2 has aound 150,000 rows.
I am inserting all the rows from TAB2 to TAB1 and I am using following query

MERGE INTO TAB1 TRG
USING (SELECT COL1,COL2.....
FROM TAB2 ) SRC
ON (TRG.COL1 = SRC.COL1
AND TRG.COL2 = SRC.COL2
AND TRG.COL3 = SRC.COL3)
WHEN NOT MATCHED THEN
INSERT (TRG.COL1,TRG.COL2 ....)
VALUES (SRC.COL1,SRC.COL2 ....);

Following is the explain plan for this statement
Plan
MERGE STATEMENT ALL_ROWS Cost: 14,380 Bytes: 81,668,417 Cardinality: 143,027
6 MERGE TAB2
5 VIEW
4 HASH JOIN OUTER Cost: 14,380 Bytes: 32,324,102 Cardinality: 143,027
1 TABLE ACCESS FULL TABLE TAB1 Cost: 304 Bytes: 14,588,754 Cardinality: 143,027
3 TABLE ACCESS BY GLOBAL INDEX ROWID TABLE TAB2 Cost: 12,999 Bytes: 69,025,716 Cardinality: 556,659 Partition #: 5
2 INDEX RANGE SCAN INDEX IDX2 Cost: 1,551 Cardinality: 573,852

There are 2 indexes in TAB1
1. IDX1 on col1 and col2
2. IDX2 on col3

There is one index in TAB2
1.IDX1 on col1 and col2

TAB2 is truncated and populated with rows every 2-3 days.

This process takes too much time to complete.

Is there any other way to improve the performace of this process(inserting data from TAB2 to TAB1)?
or Is there any way other than using MERGE to improve performace?

Thanks.
riedelme

Posts: 1,792
Registered: 04/03/02
Re: Need to Improve performace of query
Posted: Nov 18, 2009 7:14 AM   in response to: user552703 in response to: user552703
 
Click to report abuse...   Click to reply to this thread Reply
Tuning MERGE statements is hard :(. Some observations:

You're doing a full table scan on table1. There are no filters to restrict the number of rows read so this is probably necessary. An indexed lookup is taking place on table2 which is probably helping performance. This is taking place on partitioned tables (global index).

Global indexes can help partiion queries when joins cross partitions. if that is not the case using partitinoned indexes to allow pruning could be more effective.

In my experience MERGE tends to be less effective when inserting data than updating with the occasional insert. That is, MERGE can be a bad choice when mostly inserting data due to the overhead of checking to see if a row exists before inserting when it does not. Another approach in your situation might offer better performance
* select loop with insert/update logic (probably slower, but sometimes is faster)
* insert table as select with WHERE clause subquery avoiding rows that are there
update with WHERE clause subquery to update rows that were there
damorgan

Posts: 9,443
Registered: 10/20/03
Re: Need to Improve performace of query
Posted: Nov 18, 2009 7:24 AM   in response to: user552703 in response to: user552703
 
Click to report abuse...   Click to reply to this thread Reply
No version number and no, what you posted, is not an explain plan. It looks like something created by toad and that is wholly inadequate for serious examination.

If what you are doing is straight inserts I would dump the MERGE and use the APPEND hint. If there is something else going on then describe it in detail.

What does "too long" mean? Provide metrics. Is it taking 5 seconds or 5 hours or 5 days? What is the target time you are trying to achieve.

Also what is the operating system and the storage solution? Is this on a single "C" drive or a well design striped and mirrored SAN with a tune read and write cache?

In the future create your explain plans using DBMS_XPLAN.
http://www.morganslibrary.org/library.html
RDB

Posts: 45
Registered: 07/17/07
Re: Need to Improve performace of query
Posted: Nov 18, 2009 7:36 AM   in response to: user552703 in response to: user552703
 
Click to report abuse...   Click to reply to this thread Reply
It looks like COL1,COL2,COL3 are primary keys for both tables? If this is the case at least a unique index should help. In addition rewriting to just an insert statement might help:

INSERT INTO TAB1
SELECT COL1, COL2, ....
FROM TAB2
WHERE (COL1, COL2, COL3)
IN (SELECT COL1, COL2, COL3 FROM TAB2
MINUS
SELECT COL1, COL2, COL3 FROM TAB1)

Or possibly with extra where clause for the large table may be faster,

INSERT INTO TAB1
SELECT COL1, COL2, ....
FROM TAB2
WHERE (COL1, COL2, COL3)
IN (SELECT COL1, COL2, COL3 FROM TAB2
MINUS
SELECT COL1, COL2, COL3 FROM TAB1
WHERE (COL1, COL2, COL3) IN (SELECT COL1, COL2, COL3 FROM TAB2))

For bulk loading using the INSERT /*+ APPEND */ INTO ... hint might be worth reading up on too.
David_Aldridge

Posts: 595
Registered: 04/22/08
Re: Need to Improve performace of query
Posted: Nov 18, 2009 8:45 AM   in response to: user552703 in response to: user552703
 
Click to report abuse...   Click to reply to this thread Reply
The trick here would be to optimise the hash outer join, and I have a couple of suggestions:

Firstly, you could do this by hash partitioning (or subpartitioning) on the join keys. Partition-wise joins would reduce the chances of the join spilling to disk.

Secondly, you might be able to introduce a redundant predicate into the source query that can be used to partition prune on the target table.

The latter option works in a situation where you are merging a set of data into a table that has a limited range of values on the join key, and a classic example of this would be in maintaining an aggregate table in a data warehouse by merging in recent data only. The aggregate table might be range partitioned on date_of_transaction and cover a range of values over ten years, but the new data might just cover the most recent week. In that case you would like to be able to place a predicate directly on the target table to say "I'm only going to find a join on this range of rows". You can't do that directly but the optimiser can infer it if you place a redundant predicate in the USING clause.

Demonstration script:

create table src (col1 number, col2 number);
 
insert into src 
select 2, rownum
from   dual
connect by rownum < 10000
/ 
 
 
create table tgt (col1 number, col2 number)
partition by range (col1)
(
partition p1 values less than (2),
partition p2 values less than (3)
)
;
 
insert into tgt
select 1, rownum
from   dual
connect by rownum < 100000
/ 
 
commit;
exec dbms_stats.gather_table_stats(user,'src');
exec dbms_stats.gather_table_stats(user,'tgt');


Table tgt only has values in partition P1, but the merge is only going to populate partition P2.

explain plan for
merge into tgt
using (select * from src) src
on (tgt.col1 = src.col1)
when matched then update
set col2 = src.col2
when not matched then insert
values (src.col1, src.col2)
/ 
 
select * from table(dbms_xplan.display)
/ 
 
explain plan succeeded.
PLAN_TABLE_OUTPUT                                                                                                                                                                                                                                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
Plan hash value: 3718868795                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                             
-----------------------------------------------------------------------------------------------                                                                                                                                                                                                              
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |                                                                                                                                                                                                              
-----------------------------------------------------------------------------------------------                                                                                                                                                                                                              
|   0 | MERGE STATEMENT        |      |  9999 |   449K|    64   (5)| 00:00:01 |       |       |                                                                                                                                                                                                              
|   1 |  MERGE                 | TGT  |       |       |            |          |       |       |                                                                                                                                                                                                              
|   2 |   VIEW                 |      |       |       |            |          |       |       |                                                                                                                                                                                                              
|*  3 |    HASH JOIN OUTER     |      |  9999 |   126K|    64   (5)| 00:00:01 |       |       |                                                                                                                                                                                                              
|   4 |     TABLE ACCESS FULL  | SRC  |  9999 | 59994 |     6   (0)| 00:00:01 |       |       |                                                                                                                                                                                                              
|   5 |     PARTITION RANGE ALL|      | 99999 |   683K|    56   (2)| 00:00:01 |     1 |     2 |                                                                                                                                                                                                              
|   6 |      TABLE ACCESS FULL | TGT  | 99999 |   683K|    56   (2)| 00:00:01 |     1 |     2 |                                                                                                                                                                                                              
-----------------------------------------------------------------------------------------------                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                             
Predicate Information (identified by operation id):                                                                                                                                                                                                                                                          
---------------------------------------------------                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                             
   3 - access("TGT"."COL1"(+)="SRC"."COL1")                                                                                                                                                                                                                                                                  
 
18 rows selected


So you see from the above that a full scan of both partitions of TGT is performed.

We introduce a redundant predicate into the USING clause:

explain plan for
merge into tgt
using (select * from src where col1 >= 2) src
on (tgt.col1 = src.col1)
when matched then update
set col2 = src.col2
when not matched then insert
values (src.col1, src.col2)
/ 
 
select * from table(dbms_xplan.display)
/ 
 
explain plan succeeded.
PLAN_TABLE_OUTPUT                                                                                                                                                                                                                                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
Plan hash value: 2500172128                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                             
--------------------------------------------------------------------------------------------------                                                                                                                                                                                                           
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |                                                                                                                                                                                                           
--------------------------------------------------------------------------------------------------                                                                                                                                                                                                           
|   0 | MERGE STATEMENT           |      |  9999 |   449K|     9  (12)| 00:00:01 |       |       |                                                                                                                                                                                                           
|   1 |  MERGE                    | TGT  |       |       |            |          |       |       |                                                                                                                                                                                                           
|   2 |   VIEW                    |      |       |       |            |          |       |       |                                                                                                                                                                                                           
|*  3 |    HASH JOIN RIGHT OUTER  |      |  9999 |   429K|     9  (12)| 00:00:01 |       |       |                                                                                                                                                                                                           
|   4 |     PARTITION RANGE SINGLE|      |     1 |    38 |     2   (0)| 00:00:01 |     2 |     2 |                                                                                                                                                                                                           
|   5 |      TABLE ACCESS FULL    | TGT  |     1 |    38 |     2   (0)| 00:00:01 |     2 |     2 |                                                                                                                                                                                                           
|*  6 |     TABLE ACCESS FULL     | SRC  |  9999 | 59994 |     6   (0)| 00:00:01 |       |       |                                                                                                                                                                                                           
--------------------------------------------------------------------------------------------------                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                                                             
Predicate Information (identified by operation id):                                                                                                                                                                                                                                                          
---------------------------------------------------                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                             
   3 - access("TGT"."COL1"(+)="SRC"."COL1")                                                                                                                                                                                                                                                                  
   6 - filter("COL1">=2)                                                                                                                                                                                                                                                                                     
 
19 rows selected


You see from the plan that only one partition of TGT is now being scanned, and as that is empty in this case it will be a very fast action.

Even without partition pruning this could well be more efficient than the "vanilla" alternative by giving the optimiser more information about the size of the subset of TGT rows that are being merged into and giving the opportunity of a more efficient path for reading them.
user552703

Posts: 9
Registered: 01/11/07
Re: Need to Improve performace of query
Posted: Nov 18, 2009 8:54 PM   in response to: damorgan in response to: damorgan
 
Click to report abuse...   Click to reply to this thread Reply
Hi Daniel,

I apologize for the insufficient information. Next time when I will post any question I will try to provide all the information I have.

Thanks

user552703

Posts: 9
Registered: 01/11/07
Re: Need to Improve performace of query
Posted: Nov 18, 2009 9:07 PM   in response to: user552703 in response to: user552703
 
Click to report abuse...   Click to reply to this thread Reply
Thanks a lot all.....
Legend
Guru Guru : 2500 - 1000000 pts
Expert Expert : 1000 - 2499 pts
Pro Pro : 500 - 999 pts
Journeyman Journeyman : 200 - 499 pts
Newbie Newbie : 0 - 199 pts
Oracle ACE Director
Oracle ACE Member
Oracle Employee ACE
Helpful Answer (5 pts)
Correct Answer (10 pts)

Point your RSS reader here for a feed of the latest messages in all forums