Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Unusual long execution time for a MERGE statement

2681027Oct 29 2016 — edited Oct 29 2016

Hi guys,

I’m facing a strange performance issue and I really don’t know what to do anymore...

First I would like to give you some short background info. I’m working on a new BI Reporting System regarding the adoption/penetration of a company specific version of OS, which is called OBI (it’s basically and OS image, customized for our company’s needs). Right now exists some basic reports (in BI) based on some pre-aggregated queries, but there is a requirement to be developed as a proper Reporting system (with its own Star Schema & ETL & Repository & Subject Area).

My issue is on a particular loading procedure... even though the SELECT which give me the data to be inserted into this table it takes less than 1 second, the loading process takes forever. So I will stay only in this part of the tables & configuration info.

In the source database, there are 4 tables:

  • OPERATING_SYSTEM
  • HARDWARE
  • INSTANCES
  • INSTANCES_HARDWARE

The OPERATING_SYSTEM table contains data regarding operating systems that can be installed on our machines.

The HARDWARE table contains data regarding each hardware model (for example, a laptop Lenovo T440P has just 1 record, even though we have hundred of them in the company).

The INSTANCES contains data regarding the instances of the OS installation (among them we have the OS version, the build version of the OS and so on...)

The INSTANCES_HARDWARE table is a bridge table between HARDWARE and INSTANCES and contains the info at the physical machine level (contains the system_tag, which is unique for each equipment, the CPU/Memory information and so on...).

In my Star Schema I designed it in this way:

  • OP_SYSTEM_DIMENSION: contains info regarding the operating systems that can be installed on our machine
  • HARDWARE_MODEL_DIMENSION: contains info at the model level (Lenovo T440P will be only 1 record). Has HAR_ID as Primary Key.
  • HARDWARE_DIMENSION: will contain info for each equipment. Has HAR_KEY as Primary Key (generated by a trigger/sequence. Also will have HAR_ID as Foreign_Key to the HARDWARE_MODEL_DIMENSION table.
  • INSTANCES_DIMENSION: will contain info regarding the instances. It has INS_ID as Primary_Key. Also will have HAR_KEY as Foreign_Key to the HARDWARE_DIMENSION table and OS_ID as Foreign_Key to the OP_SYSTEM_DIMENSION table.
  • - ...
  • - ...

The report will be at the month level, but the DATE is defined in another table (let’s call it USAGE in the original schema and USAGE_DIMENSION in my star schema), which contains info about the instance (hardware, op_system) and the user which did the installation, when he did it. Date which is not always the same as in the INSTANCES => I cannot use the partitioned technique (truncate the month partition and reload the month info). The INSTANCES_DIMENSION table is partitioned, but is not relevant from the execution point of view. The alternative was to use MERGE, which I did.

First thing first:

SELECT * FROM v$version;

  

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production

NLSRTL Version 11.2.0.4.0 - Production

Then, the table definition: see the attached CT_SQL.txt

Now, my MERGE statement:

   MERGE INTO INSTANCES_DIMENSION d

  USING (SELECT ins.ins_id, ih.ih_id, ins.os_os_id, ih.har_har_id, h.har_key, ih.ethernet_address, ins.machine_name, ih.system_tag, ih.bios_revision,

                ih.bios_date, ih.bios_machine_model, ins.status, ins.install_date, ins.created_date, ins.bvl_bvl_id, ins.os_service_pack, ins.client_version

           FROM INSTANCES ins

           JOIN INSTANCES_HARDWARE ih ON (ins.ih_ih_id = ih.ih_id AND ins.created_date = ih.created_date)

           JOIN HARDWARE_DIMENSIONS h ON (ih.har_har_id = h.har_id AND ih.system_tag = h.system_tag)) s

     ON (d.ins_id = s.ins_id)

   WHEN NOT MATCHED THEN INSERT (d.ins_id, d.ih_ih_id, d.os_id, d.har_id, d.har_key, d.ethernet_address, d.machine_name, d.system_tag, d.bios_revision,

                                 d.bios_date, d.bios_machine_model, d.status, d.install_date, d.created_date, d.bvl_bvl_id, d.os_service_pack, d.client_version)

                         VALUES (s.ins_id, s.ih_id, s.os_os_id, s.har_har_id, s.har_key, s.ethernet_address, s.machine_name, s.system_tag, s.bios_revision,

                                 s.bios_date, s.bios_machine_model, s.status, s.install_date, s.created_date, s.bvl_bvl_id, s.os_service_pack, s.client_version);

Which it takes forever (after 2 hours I stopped it). I don’t know why it takes so long.

INSTANCES table has around 1,360,000 records

INSTANCES_HARDWARE table has around 987,000 records

HARDWARE_DIMENSIONS table has around 987,000 records (the same numbers as the INSTANCES_HARDWARE table)

The relation between INSTANCES_HARDWARE and INSTANCES is 1 to MANY => this is the reason we have more records in INSTANCES table than in INSTANCES_HARDWARE. Whenever an instance is created, is inserted in both table (which one with its own information), but the following sessions on the same hardware is inserted only in INSTANCES table... in INSTANCES_HARDWARE we will have only the first one.

The SELECT itself it takes less than 1 second to be executed... why the MERGE it takes forever?

I attach the execution plan for the MERGE statement (explain_MERGE.html) and the one just for the SELECT (explain_SELECT.html).

Now, please tell me what I'm doing wrong, because I really don't know what to do anymore. This is not my first ETL or my first project... I have others by far more complicated (I'm talking about the INSERT or MERGE statements), involving more than 3 lousy tables and which contains many millions record each, but non of them give me such a headache like this.

Sorry if missed anything... Unfortunately I'm now able to provide you any sample data due the privacy reason, but I believe is pretty clear what kind of data am I talking about.

Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Nov 26 2016
Added on Oct 29 2016
1 comment
343 views