Hi All,
We have a batch process (in Informatica) that runs for about 30 hours to process 35 million records with around 250 columns. The records come from a SELECT query with a DISTINCT clause and multi-table joins.
As per our log, the first row was returned from the database after 7 hours, and Informatica then started inserting records into the target.
To make this clearer, here is the timeline:
Starting time of the mapping : 1 AM
First record fetched from DB : 7 AM
Target insertion started : 8 AM (1 hr delay because Informatica was creating lookup files)
Finish time : 7 AM (Next Day).
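To put numbers on that timeline, here is a small sketch of the arithmetic. Only the clock times and the 35 million record count come from our log; the calendar date is arbitrary and the variable names are just for illustration. Note that the clock times give 6 hours from mapping start to first row, slightly less than the 7 hours quoted from the log.

```python
from datetime import datetime, timedelta

# Arbitrary calendar date (an assumption); only the clock times are from the log.
day = datetime(2024, 1, 1)

mapping_start = day + timedelta(hours=1)           # 1 AM
first_row = day + timedelta(hours=7)               # 7 AM
insert_start = day + timedelta(hours=8)            # 8 AM
finish = day + timedelta(days=1, hours=7)          # 7 AM next day

total_records = 35_000_000


def hours(delta):
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


time_to_first_row = hours(first_row - mapping_start)   # wait before Oracle returns anything
lookup_build = hours(insert_start - first_row)         # Informatica lookup file creation
insert_window = hours(finish - insert_start)           # time spent inserting into the target
total_runtime = hours(finish - mapping_start)

# Average rate over the insert window only
avg_rate = total_records / insert_window

print(f"time to first row : {time_to_first_row:.0f} h")
print(f"lookup build      : {lookup_build:.0f} h")
print(f"insert window     : {insert_window:.0f} h")
print(f"total runtime     : {total_runtime:.0f} h")
print(f"average rate      : {avg_rate / 1e6:.2f} million records/h")
```

So the average over the insert window is roughly 1.5 million records per hour, which sits between the fast and slow rates described in point 2 below.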
My questions are:
1) As the query has a DISTINCT clause, I assume Oracle returns the first row only after it has processed all the rows, keeping them in a buffer or temp files before sending the first record to Informatica.
That would mean Oracle has all the records available once we receive the first record. Is my assumption right? If not, can you please elaborate on how this works?
2) If my assumption in point 1 is right, why do we see fluctuation in the number of records processed per hour? For example, from 7 AM to 10 AM it processes
3 million records per hour, in the next 4 hrs it processes only 0.3 million records per hour, and then it goes back to 3 million records per hour.
As Oracle has done all the processing and has the records ready, it simply needs to pass them to Informatica,
so ideally there should not be any fluctuation. What could be the reason for this?
3) I would like to trend the CPU usage, I/O usage and memory usage of this Oracle session at 1-minute intervals. Can someone suggest a good way to do this?
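For context, this is roughly what I had in mind, as a sketch only. It assumes the session counters are sampled once a minute (for example from V$SESSTAT, whose values are cumulative since the session started) and then turned into per-minute deltas. The sample values below are made up, and the function name is hypothetical. Since the query runs in parallel, I assume the parallel slave sessions would also have to be sampled and summed, which this sketch does not do.

```python
def per_interval_deltas(samples):
    """Turn cumulative counter samples into per-interval deltas.

    samples: list of (label, {stat_name: cumulative_value}) in time order.
    Returns a list of (label, {stat_name: delta}) with one entry per
    interval, labelled with the end time of that interval.
    """
    deltas = []
    for (_, prev), (label, curr) in zip(samples, samples[1:]):
        # A stat missing from the previous sample is treated as starting at 0.
        delta = {name: curr[name] - prev.get(name, 0) for name in curr}
        deltas.append((label, delta))
    return deltas


# Made-up values for illustration; real ones would come from sampling the session.
samples = [
    ("07:00", {"CPU used by this session": 1000, "physical reads": 50000}),
    ("07:01", {"CPU used by this session": 1600, "physical reads": 58000}),
    ("07:02", {"CPU used by this session": 1650, "physical reads": 58200}),
]

trend = per_interval_deltas(samples)
for label, delta in trend:
    print(label, delta)
```

A sharp drop in the deltas from one minute to the next, as between 07:01 and 07:02 in the made-up data, is the kind of pattern I would want to line up against the slow hours.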
NOTE: The query has a PARALLEL clause.