We refresh data from the production environment using the Scheduler tool (chains, programs, programs, etc.). It is probably not the best way, but I inherited the structure, so it is what it is. Views pull data across a database link, apply logic to the data (i.e. ontime pickup, ontime delivery), and populate the tables.
The procedures insert into temp tables bringing the data across a database link. Then records are deleted and inserted into the actual table based on the records in the temp table using the IN clause on the primary key. Some of the tables have several hundred columns (again inherited, not preferable) so using an UPDATE statement is not feasible. Below is an example of one of the procedures.
DATE_CUTOFF is the create date or update date of the record. NN_SHIPMENT_INFO_XA is the actual table. NN_SHIPMENT_INFO_XA_TEMP is the temp table.
NN_SHIPMENT_INFO_XA has 292 columns, so deleting and inserting 10k rows is taking almost two hours. I have read that IN is one of the most inefficient operators. I have tried gathering statistics on the temp table each time, but that has not helped much. Breaking the view into pieces and reducing the number of columns is proving difficult because there are several layers of logic and tables between the database link and the end user layer. Due to its complexity, I am committed to my predecessor's design for the foreseeable future. Is there a better way to construct this procedure to make it run faster?
All suggestions are greatly appreciated.