Solaris 10 SNMP Inconsistent
807557, May 19 2009 (edited May 22 2009)
Hi,
We are running SunOS 5.10 Generic_137138-09 i86pc i386 i86pc on two X4600s. On both of them SNMP reports running processes inconsistently, which sends our ops monitoring (Zenoss) into a tailspin: it reports processes as down when they are in fact up.
Here's an example:
root@ug1s02:/ # ps -ef|grep slap
root 17960 843 0 12:56:42 pts/11 0:00 grep slap
root 16992 1 0 May 10 ? 17:00 /opt/DSServers/ds6/lib/64/ns-slapd -D /opt/DSServers/slapd-ug1s02-zone3 -i /opt
root@ug1s02:/ # snmpwalk -c public -v 2c localhost |grep slap
HOST-RESOURCES-MIB::hrSWRunName.16992 = STRING: "ns-slapd"
HOST-RESOURCES-MIB::hrSWRunPath.16992 = STRING: "/opt/DSServers/ds6/lib/64/ns-slapd"
HOST-RESOURCES-MIB::hrSWRunParameters.16992 = STRING: "-D /opt/DSServers/slapd-ug1s02-zone3 -i /opt"
root@ug1s02:/ # snmpwalk -c public -v 2c localhost |grep slap
root@ug1s02:/ # ps -ef|grep slap
root 17960 843 0 12:56:42 pts/11 0:00 grep slap
root 16992 1 0 May 10 ? 17:00 /opt/DSServers/ds6/lib/64/ns-slapd -D /opt/DSServers/slapd-ug1s02-zone3 -i /opt
As you can see, ns-slapd is running, and the first snmpwalk shows the process name, path and parameters for PID 16992. When I run the same snmpwalk again a few seconds later, the process is missing from the SNMP data, even though ps shows it is still fine. The process seems to appear in SNMP at random and is missing around 10% of the time. When the grep returns nothing, it is not because the snmpwalk itself is failing, as other process info is still returned. Also, it only seems to happen to processes running in the non-global zone.
There are no errors in /var/log/snmpd.log and the SNMP config file is completely standard.
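To put a number on the "around 10%" figure, a rough sketch like the one below could be used. It repeats a walk and counts how many runs do not mention the process at all. The function name count_misses and the run count are made up for illustration, and it assumes a POSIX-style shell such as bash or ksh rather than the legacy Solaris /bin/sh.

```shell
# Hypothetical helper (not part of Net-SNMP or Solaris):
# run COMMAND RUNS times and count the runs whose output
# does not contain PATTERN.
# Usage: count_misses PATTERN RUNS COMMAND [ARGS...]
count_misses() {
    pattern=$1
    runs=$2
    shift 2
    misses=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # grep -c prints the number of matching lines; 0 means
        # the process was absent from this run's output.
        hits=$("$@" 2>/dev/null | grep -c "$pattern")
        if [ "$hits" -eq 0 ]; then
            misses=$((misses + 1))
        fi
        i=$((i + 1))
    done
    # Print the result as misses/runs, e.g. 9/100.
    echo "$misses/$runs"
}

# Example against the local agent, limiting the walk to the
# hrSWRunName column to keep each run short:
# count_misses ns-slapd 100 snmpwalk -c public -v 2c localhost HOST-RESOURCES-MIB::hrSWRunName
```

Running it once for a process in the global zone and once for one in the non-global zone would show whether the misses really are limited to zone processes.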
Any help or suggestions would be great.
Thanks
Paul