Solaris 11.3.19.5.0 hangs and cannot generate a crash dump on an X4170 server

adahiya, Nov 8 2017 (edited Nov 9 2017)

Hi Team,

We are running Solaris 11.3.19.5.0 on an X4170 server. It hangs roughly every 15 days. ILOM shows nothing abnormal, but we cannot log in to the server or ping it over the network.

After a reboot everything is fine, and no hang-related messages appear in the messages file. The customer now wants an RCA for the repeated hangs. We tried sending an NMI, but no crash dump was generated and Solaris did not panic.

After the reboot the server worked perfectly, with no performance issues.

Is there any way to manually generate a crash dump on x86 servers, or is there a known bug in Solaris 11.3.19.5.0 that we might be hitting?
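For context on the NMI attempt: on x86, Solaris generally panics on an NMI only if the kernel has been told to do so. The sketch below just prints the /etc/system lines that are commonly suggested for this; it does not change the system. The tunable name apic_panic_on_nmi and the module names pcplusmp and apix are assumptions that should be checked against Oracle documentation for this SRU, and nmi_tunables is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print candidate /etc/system lines that ask an x86 Solaris
# kernel to panic (and therefore write a crash dump) when it receives an NMI.
# Assumption: the tunable is apic_panic_on_nmi, exposed by the pcplusmp or
# apix interrupt module depending on which one the system uses.

nmi_tunables() {
    for module in pcplusmp apix; do
        printf 'set %s:apic_panic_on_nmi = 1\n' "$module"
    done
}

# Print the candidate lines; nothing is written to /etc/system.
nmi_tunables
```

If the lines apply to this system, they would be added to /etc/system and take effect after a reboot. The dump device and savecore directory reported by dumpadm also need to be valid for the dump to be saved.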

root@EMS-BGF-SERVER:~# pkg info entire
             Name: entire
          Summary: entire incorporation including Support Repository Update (Oracle Solaris 11.3.19.5.0).
      Description: This package constrains system package versions to the same
                   build.  WARNING: Proper system update and correct package
                   selection depend on the presence of this incorporation.
                   Removing this package will result in an unsupported system.
                   For more information see:
                   https://support.oracle.com/rs?type=doc&id=2045311.1
         Category: Meta Packages/Incorporations
            State: Installed
        Publisher: solaris
          Version: 0.5.11 (Oracle Solaris 11.3.19.5.0)
    Build Release: 5.11
           Branch: 0.175.3.19.0.5.0
   Packaging Date: Fri Apr 07 23:19:31 2017
             Size: 5.46 kB
             FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.3.19.0.5.0:20170407T231931Z
root@EMS-BGF-SERVER:~#

root@EMS-BGF-SERVER:~# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Oct 28 13:24:42 5c058aff-82df-4b4d-b2dd-d14564f6f81f  USB-8000-80    Major
Problem Status    : open
Diag Engine       : eft / 1.16
System
    Manufacturer  : unknown
    Name          : unknown
    Part_Number   : unknown
    Serial_Number : unknown
System Component
    Manufacturer  : SUN MICROSYSTEMS
    Name          : SUN FIRE X4170 SERVER
    Part_Number   : 4442481-2
    Serial_Number : 0935XF5054
    Host_ID       : 000c169c
----------------------------------------
Suspect 1 of 1 :
   Problem class : fault.io.usb.dur
   Certainty   : 100%
   Affects     : dev:////pci@0,0/pci108e,4844@1a,7/hub@3
   Status      : faulted but still in service
   Resource
     FMRI             : "hc://:chassis-mfg=ORACLE-CORPORATI:chassis-name=SUN-FIRE-X4170-SERVER:chassis-part=To-Be-Filled-By-O.E.M.:chassis-serial=0935XF5054:fru-part=ff01-046b/motherboard=0/hostbridge=0/usb-bus=3/usbhub=3"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : ff01-046b
     Revision         : unknown
     Serial_Number    : unknown
     Chassis
        Manufacturer  : ORACLE-CORPORATI
        Name          : SUN-FIRE-X4170-SERVER
        Part_Number   : To-Be-Filled-By-O.E.M.
        Serial_Number : 0935XF5054
     Status           : faulted but still in service
Description : The USB device detected that the end point returned less data
              than required resulting in a data underrun condition. The
              corresponding driver may not be able to recover from the errors
              automatically.
Response    : Device may have been disabled or may not be fully functional.
Impact      : Loss of services provided by the device instances associated with
              this fault.
Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              Please refer to the associated reference document at
              http://support.oracle.com/msg/USB-8000-80 for the latest service
              procedures and policies regarding this diagnosis.

The server was rebooted on 7-11-17 at 22:04.

Thanks

Ankit Dahiya

Post Details
Locked on Dec 7 2017
Added on Nov 8 2017