Hi guys,
I have the following setup:
2 VMs running Solaris 11.2 x86 (a 3rd VM acts as the iSCSI target and quorum server)
Oracle Solaris Cluster 4.2
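In case it matters, the cluster software itself was installed roughly like this on both nodes (the repository URI below is a placeholder, not the one I actually used):

root@node1:~# pkg set-publisher -g file:///repo/ha-cluster ha-cluster
root@node1:~# pkg install ha-cluster-full
root@node1:~# /usr/cluster/bin/scinstall

with scinstall run interactively to configure the two-node cluster.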
After cluster software install and initial setup, svc:/system/cluster/manager-glassfish3:default enters maintenance mode:
root@node2:~# svcs -xv
svc:/system/cluster/manager-glassfish3:default (SOLARIS CLUSTER MANAGER - GLASSFISH3)
State: maintenance since December 27, 2014 05:04:47 PM EET
Reason: Start method exited with $SMF_EXIT_ERR_FATAL.
See: http://support.oracle.com/msg/SMF-8000-KS
See: /var/svc/log/system-cluster-manager-glassfish3:default.log
Impact: 1 dependent service is not running:
svc:/system/cluster/manager:default
root@node2:~# cat /var/svc/log/system-cluster-manager-glassfish3:default.log
[ Dec 27 12:56:39 Enabled. ]
[ Dec 27 12:56:39 Executing start method ("/lib/svc/method/svc-cluster-manager-glassfish3 start"). ]
Waiting for domain1 to start ...............
Successfully started the domain : domain1
domain Location: /usr/cluster/lib/ClusterManager/glassfish3/glassfish/domains/domain1
Log File: /usr/cluster/lib/ClusterManager/glassfish3/glassfish/domains/domain1/logs/server.log
Admin Port: 4848
Command start-domain executed successfully.
[ Dec 27 12:56:59 Method "start" exited with status 0. ]
[ Dec 27 12:56:59 Rereading configuration. ]
[ Dec 27 12:56:59 Executing refresh method ("/lib/svc/method/svc-cluster-manager-glassfish3 refresh"). ]
[ Dec 27 13:07:03 Executing start method ("/lib/svc/method/svc-cluster-manager-glassfish3 start"). ]
There is a process already using the admin port 4848 -- it probably is another instance of a GlassFish server.
Command start-domain failed.
The Oracle Glassfish Server is already running or admin is not able to start the server.
[ Dec 27 13:07:06 Method "start" exited with status 95. ]
root@node2:~# cacaoadm status
default instance is ENABLED at system startup.
Smf monitoring process:
2412
2413
Uptime: 0 day(s), 2:14
root@node2:~# cacaoadm list-params | grep network
network-bind-address=0.0.0.0
root@node2:~# netstat -a | grep 4848
root@node2:~#
I tried multiple times to disable, enable and clear the failed service, but no luck (roughly the commands shown below). The cluster itself is operational: I have quorum (both the shared disk and the quorum server), and I successfully created a resource group (with LogicalHostname and HAStoragePlus resources) that switches to the other node when requested. But manager-glassfish3 and the dependent manager service stay down, so among other things I cannot use clsetup to create a zone cluster.
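For completeness, this is roughly what I ran; the resource group, resource, zpool and hostname names below are just placeholders, not necessarily the exact ones I used.

Clearing and restarting the failed service:

root@node2:~# svcadm clear svc:/system/cluster/manager-glassfish3:default
root@node2:~# svcadm disable svc:/system/cluster/manager-glassfish3:default
root@node2:~# svcadm enable svc:/system/cluster/manager-glassfish3:default
root@node2:~# svcs -l svc:/system/cluster/manager-glassfish3:default

Creating and switching the test resource group:

root@node2:~# clresourcegroup create test-rg
root@node2:~# clreslogicalhostname create -g test-rg -h test-lh test-lh-rs
root@node2:~# clresourcetype register SUNW.HAStoragePlus
root@node2:~# clresource create -g test-rg -t SUNW.HAStoragePlus -p Zpools=testpool test-hasp-rs
root@node2:~# clresourcegroup online -eM test-rg
root@node2:~# clresourcegroup switch -n node1 test-rg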
Any ideas?
Much appreciated!