CRS version : 11.1.0.7
Platform : Solaris 10
Hi Volunteers,
Due to a hardware issue, an image of our entire 2-node RAC cluster had to be taken and replicated on different machines.
On the new hosts, crsctl check crs showed everything as healthy:
$ crsctl check crs
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy
But none of the resources were in ONLINE state in the crs_stat output.
The local filesystem hosting ORACLE_HOME and CRS_HOME was also filling up mysteriously: 2 to 3 GB in one to two minutes!
I remember removing core dump files and .aud and .trc files from the ASM, CRS and RDBMS directories.
When I tried to start ASM using sqlplus or srvctl, I got the error below:
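In hindsight, something like the following could have shown which files were growing so fast. This is only a rough sketch: growing_files is a made-up helper name, it assumes bash or ksh, and it uses a marker file with find -newer instead of relying on newer find options that may not be available on Solaris 10.

```shell
# List files under a directory that were modified after a marker file,
# biggest first (size in KB as reported by du, then the path).
# growing_files is a hypothetical helper, not an Oracle or Solaris tool.
# Usage: touch /tmp/marker; sleep 60; growing_files "$ORACLE_HOME" /tmp/marker
growing_files() {
    dir=$1
    marker=$2
    # -newer keeps only files whose mtime is later than the marker's
    find "$dir" -type f -newer "$marker" -exec du -k {} \; 2>/dev/null |
        sort -nr
}
```

Running it once against ORACLE_HOME and once against CRS_HOME, a minute after creating the marker, should narrow things down to the directory producing the core dumps or trace files.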
srvctl start asm -n brcfrac214
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "brcfrac214", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "brcfrac214",
[CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.brcfrac214.ASM1.asm' has placement error.]]
Later, the Solaris admin found that the hostname for node 1 was configured wrongly (misspelled in some config file!).
Reverting to the correct hostname and rebooting fixed the issue.
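A check along these lines might have caught the misspelling earlier. It is a minimal sketch under assumptions: check_nodename is a made-up helper, it assumes bash or ksh, and /etc/nodename in the usage line is only my guess at where a Solaris 10 node name would be stored, since I do not know which config file was actually wrong.

```shell
# Compare an expected node name with the first line of a config file.
# check_nodename is a hypothetical helper; the file to check is an assumption.
# Usage: check_nodename brcfrac214 /etc/nodename
check_nodename() {
    expected=$1
    file=$2
    # Take the first line and strip spaces, tabs and carriage returns
    actual=$(head -n 1 "$file" | tr -d ' \t\r')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file matches $expected"
    else
        echo "MISMATCH: $file has '$actual', expected '$expected'"
        return 1
    fi
}
```

The expected name is passed in by hand here (brcfrac214 for node 1), so the check does not depend on what the possibly misconfigured OS reports for itself.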
1. Why did "crsctl check crs" show everything as healthy despite a serious issue?
2. What are the first few things you would look at when none of the resources in crs_stat are online?
3. Should I be looking for any irregularities in the ps -ef output below?
ps -ef | grep "init\."
4. Is ASM not coming up usually the first sign of a major cluster issue?
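For question 3, this is roughly the check I have in mind. It is a sketch only: missing_init_scripts is a made-up helper, and the three script names (init.cssd, init.crsd, init.evmd) are my assumption of what a standard 11.1 install starts from /etc/inittab.

```shell
# Read `ps -ef` output on stdin and print the CRS init scripts that do
# not appear in it. The list of expected scripts is an assumption.
# Usage: ps -ef | missing_init_scripts
missing_init_scripts() {
    ps_output=$(cat)
    for daemon in cssd crsd evmd; do
        # The escaped dot keeps grep from matching any character there
        if printf '%s\n' "$ps_output" | grep "init\.$daemon" >/dev/null; then
            :
        else
            echo "init.$daemon"
        fi
    done
}
```

No output would mean all three scripts are present; whether that is enough to call the stack healthy is exactly what I am unsure about.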
Thank You.