Poor performance - bad end user response time
We have experienced severe performance problems when upgrading from Solaris
8 to 9, to such an extent that we had to roll back.
Environment:
Sun Fire 15K, MPO capable expander boards
36 CPUs (32 * 1200 MHz, 4 * 1050 MHz)
144GB memory (9*16GB)
Solaris 8 (2/02 based) KP 27
CPU utilization in the high 90s during peak hours (<1% idle). Run queue often more than 36.
No memory contention
One other small domain on same 15K chassis.
EMC DMX3000 SRDF'd synchronously to EMC 8830
Approximately 3TB of Oracle database
EMC PowerPath 3.0.4 controls 6 channels
Veritas DB edition 3 (VM 3.2, FS 3.4)
No I/O bottlenecks on disk
Network connections across two 2222A 100 Mb/s cards (ce driver)
Load split between cards about 80/20. Busiest NIC 85% busy.
Oracle 8i 8.1.7.4 32 bit
2 Instances, each about 5000 connections
Online applications mainly created in Magic 8 (running on users' PCs)
A few Magic 5 users.
Batch applications mainly written in C (virtually no batch during peak hours)
Many 'component servers' also running Magic applications to do background processing during all hours
Typical end user response time 4 to 5 seconds
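
As a rough check on the NIC figures above, here is a minimal sketch of the arithmetic. It assumes that "85% busy" means 85% of the nominal 100 Mb/s link rate and that the 80/20 split refers to traffic volume; the function name and dictionary keys are only illustrative.

```python
# Rough estimate of total network load from the figures in the environment list.
# Assumptions (not measured): "busy" = fraction of nominal link speed,
# and the 80/20 split is by traffic volume.

def nic_load(link_mbps, busiest_util, busiest_share):
    """Return estimated per-card and total traffic in Mb/s."""
    busiest = link_mbps * busiest_util        # traffic on the busiest card
    total = busiest / busiest_share           # total traffic across both cards
    other = total - busiest                   # traffic on the quieter card
    balanced_util = (total / 2) / link_mbps   # utilization if split evenly
    return {
        "busiest_mbps": busiest,
        "other_mbps": other,
        "total_mbps": total,
        "balanced_util": balanced_util,
    }

if __name__ == "__main__":
    for key, value in nic_load(100.0, 0.85, 0.80).items():
        print(f"{key}: {value:.2f}")
```

Under those assumptions the two cards together carry a little over 100 Mb/s, so the busiest card has limited headroom, while an even split would leave each card at roughly half of its nominal capacity.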
Change:
When we switch to Solaris 9 KP 11 (built on a fresh install from 4/03 media, with EMC PP 4 and Veritas DB edition 3.5), CPU utilization never reaches 100%, i.e. everything looks fine from a sysadmin's point of view.
However, end user response time goes up to 20 seconds. Batch jobs run fine.
Oracle shows SQL*Net waits.
So something is clearly waiting on some resource, probably a logical one, but we do not know which.
(So far: Oracle has been relinked, MPO has been tried both enabled and disabled, the Solaris 8 TCP tuning parameters have been tried, and ce driver parameter changes have been tried.)
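
One way to frame the SQL*Net waits: if the extra response time is spent per client/server round trip rather than on CPU, a small added delay per round trip would be enough to explain the slowdown. The sketch below works through that arithmetic. The round-trip counts are hypothetical placeholders, not measurements, and 4.5 seconds is simply the midpoint of the 4 to 5 second range quoted above.

```python
# Back-of-the-envelope: how much extra delay per round trip would turn a
# 4.5 s response into a 20 s response? Round-trip counts are HYPOTHETICAL.

def added_delay_per_round_trip_ms(before_s, after_s, round_trips):
    """Extra delay per round trip (ms) needed to explain the slowdown."""
    if round_trips <= 0:
        raise ValueError("round_trips must be positive")
    return (after_s - before_s) * 1000.0 / round_trips

if __name__ == "__main__":
    before_s = 4.5   # midpoint of the Solaris 8 response time range
    after_s = 20.0   # response time observed on Solaris 9
    for round_trips in (100, 500, 1000):  # hypothetical counts per transaction
        delay = added_delay_per_round_trip_ms(before_s, after_s, round_trips)
        print(f"{round_trips} round trips: {delay:.1f} ms extra per round trip")
```

If the real number of round trips per transaction is in the hundreds, the extra delay per round trip would be in the tens of milliseconds, which would not show up as CPU load.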
Is there anyone out there who has had a similar experience, or who has some idea of what the problem might be?