Structured approach to debugging performance bottlenecks for 3rd Party apps
973077Nov 7 2012 — edited Nov 8 2012Hi All,
I am facing a situation which I believe most App Support personnel and DBAs in IT organizations do, but I havent found a structured approach to solve the problem. I am hoping this thread can help filter and pull together the varied chunks of information out there in one place.
Here is the situation. I am avoiding making it too specific, as the idea is to identify a good approach that is repeatable in other scenarios.
We are in the process of implementing a solution using a third party application (SAP's BPC), which is sitting on an Oracle database. The application implementation team has some control on to use the application to design the solution, but no direct access to the underlying queries that the app generates. We are starting to find that as the underlying database usage size is increasing (from a couple of million to tens of mllions of records), the performance of certain operations is becoming very unpredictable. Sometimes an operation would go through, relatively fast while at other times it would get stuck for over an hour and then time-out.
In such situations it is a classic battle between the Oracle DBAs and the App implementation team to try and push the ball in each other's court to try and identify and "fix" the problem.
What in your opinion would be a structured approach between the two teams to help solve the problem? For each step of the approach, please also try and see if you can point to links which further dive into specifics of executing that step.
For example, one approach might be to ...
1. DBA team to find a way to identify specific Querios/DBOperations that are taking too long. (add references here)
2. App team to collaborate with the App manufacturer's support organization to see what design changes or parameters could alter the nature of queries being generated or affect the size of the underlying tables. (too specific for each 3rd party app)
3. After exhausing (2), DBA team to analyze the remaining culprit queries and find ways to obtain better performance without changing the query or the size of the database tables via indexes/DB parameters/etc.. (add references here)
4. After exhausing (3), DBA/Unix admin team to identify which specific hardware bottlenecks are being faced (CPUs/storage/memory) to see if hardware changes can help obtain better performance.
Thoughts?