LACP PDU Timeout

DHelios, Aug 19 2016 (edited Aug 22 2016)

Hi, I have a problem with Solaris 11.3 link aggregation (LAG) and Juniper QFX5100 switches.

Sometimes Solaris does not respond to LACP PDUs in time and the link aggregation becomes unavailable.

Solaris configuration:

The physical processor has 14 cores and 28 virtual processors (0-13,28-41)

x86 (GenuineIntel 306F2 family 6 model 63 step 2 clock 2600 MHz)

  Intel(r) Xeon(r) CPU E5-2697 v3 @ 2.60GHz

The physical processor has 14 cores and 28 virtual processors (14-27,42-55)

x86 (GenuineIntel 306F2 family 6 model 63 step 2 clock 2600 MHz)

  Intel(r) Xeon(r) CPU E5-2697 v3 @ 2.60GHz

root@srv-da-zfs-02:~# dladm show-aggr -x aggr3

LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE

aggr3 -- 10000Mb full up 90:e2:ba:86:54:c0 --

       net6           10000Mb full   up        90:e2:ba:86:54:c0  attached

       net8           10000Mb full   up        90:e2:ba:86:5b:28  attached

root@srv-da-zfs-02:/etc/driver/drv# dladm show-aggr -L aggr3

LINK PORT AGGREGATABLE SYNC COLL DIST DEFAULTED EXPIRED

aggr3 net8 yes yes yes yes no no

-- net6 yes yes yes yes no no

root@srv-da-zfs-02:/etc/driver/drv# dladm show-aggr -P aggr3

LINK MODE POLICY ADDRPOLICY LACPACTIVITY LACPTIMER

aggr3 trunk L3,L4 auto passive short

root@srv-da-zfs-02:/etc/driver/drv# dladm show-vlan vlan616

LINK VID SVID PVLAN-TYPE FLAGS OVER

vlan616 616 -- -- ----- aggr3

root@srv-da-zfs-02:~# ipmpstat -a

ADDRESS STATE GROUP INBOUND OUTBOUND

:: down sc_ipmp3 -- --

zclu01-616-1 up sc_ipmp3 vlan616 vlan616

root@srv-da-zfs-02:~# kstat -p | grep alloc_fail | ggrep -v '0$'

root@srv-da-zfs-02:~#

Juniper Configuration

The switches use MC-LAG.

jpqfx5100-1-sdn> show forwarding-options enhanced-hash-key

Slot 0

Current RTAG7 Settings

----------------------

Hash-Mode : layer2-payload

inet RTAG7 settings-


inet packet fields

   Protocol                        : Yes

   Destination L4 Port             : Yes

   Source L4 Port                  : Yes

   Destination IPv4 Addr           : Yes

   Source IPv4 Addr                : Yes

   Vlan id                         : No

jpqfx5100-1-sdn> show configuration interfaces ae2

description da-zfs02;

mtu 9216;

aggregated-ether-options {

minimum-links 1;

link-speed 10g;

lacp {

    active;

    periodic fast;

    system-id 00:01:02:03:04:05;

    admin-key 4;

}

mc-ae {

    mc-ae-id 2;

    chassis-id 0;

    mode active-active;

    status-control active;

    init-delay-time 2;

}

}

unit 0 {

family ethernet-switching {

    interface-mode trunk;

    vlan {

        members SDN_DC_STORE_DVLP;

    }

}

}

jpqfx5100-1-sdn> show configuration interfaces xe-0/0/20

description ae2;

ether-options {

802.3ad ae2;

}
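Both ends are configured for the fast LACP rate (`LACPTIMER short` on the Solaris side, `periodic fast` on the switch). As a reminder of what that implies, here is a minimal sketch of the timing values defined in IEEE 802.1AX, assuming both implementations follow the standard. The function name `lacp_timing` is made up for this illustration.

```python
# LACP timing constants as defined in IEEE 802.1AX, in seconds.
FAST_PERIODIC_TIME = 1    # PDU interval when the short timeout is requested
SLOW_PERIODIC_TIME = 30   # PDU interval when the long timeout is requested
SHORT_TIMEOUT_TIME = 3 * FAST_PERIODIC_TIME   # 3 s
LONG_TIMEOUT_TIME = 3 * SLOW_PERIODIC_TIME    # 90 s


def lacp_timing(timer):
    """Return (pdu_interval, expiry_timeout) in seconds for 'short' or 'long'.

    Hypothetical helper for illustration; it is not part of dladm or Junos.
    """
    if timer == "short":
        return FAST_PERIODIC_TIME, SHORT_TIMEOUT_TIME
    if timer == "long":
        return SLOW_PERIODIC_TIME, LONG_TIMEOUT_TIME
    raise ValueError("timer must be 'short' or 'long', got %r" % (timer,))


for timer in ("short", "long"):
    interval, timeout = lacp_timing(timer)
    print("%-5s: one PDU every %2d s, partner expires after %2d s of silence"
          % (timer, interval, timeout))
```

With the short timer, a receiver that sees no LACPDU for about 3 seconds expires the partner information, so a host that stalls for a few seconds under load would be enough to produce a timeout on the switch.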

ERRORS

Juniper side

Aug 18 04:12:18.555 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:18.554 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:18.680 2016 jpqfx5100-1-sdn lacpd[1321]: LACPD_TIMEOUT: xe-0/0/20: lacp current while timer expired current Receive State: CURRENT

Aug 18 04:12:18.684 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - ATTACHED state - acting as standby link

Aug 18 04:12:18.684 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:18.682 2016 jpqfx5100-1-sdn lacpd[1321]: LACP_INTF_DOWN: ae2: Interface marked down due to lacp timeout on member xe-0/0/20

Aug 18 04:12:18.683 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:18.684 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:18.684 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:19.274 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - CD state - ready to carry traffic

Aug 18 04:12:19.274 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:19.269 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:19.275 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:19.275 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:19.308 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:19.310 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:26.278 2016 jpqfx5100-1-sdn lacpd[1321]: LACPD_TIMEOUT: xe-0/0/20: lacp current while timer expired current Receive State: CURRENT

Aug 18 04:12:26.283 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - ATTACHED state - acting as standby link

Aug 18 04:12:26.283 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:26.280 2016 jpqfx5100-1-sdn lacpd[1321]: LACP_INTF_DOWN: ae2: Interface marked down due to lacp timeout on member xe-0/0/20

Aug 18 04:12:26.281 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:26.282 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:26.282 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:26.336 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:26.334 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:29.334 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:29.338 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - DETACHED state - will not carry traffic

Aug 18 04:12:29.338 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:29.335 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:29.335 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:30.553 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:30.553 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:30.571 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:30.569 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:31.542 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:31.543 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:32.559 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - ATTACHED state - acting as standby link

Aug 18 04:12:32.559 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:32.559 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - CD state - ready to carry traffic

Aug 18 04:12:32.559 2016 jpqfx5100-1-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:32.554 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:32.555 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:32.555 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:32.558 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:32.559 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:32.559 2016 jpqfx5100-1-sdn mcsnoopd[1352]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:32.565 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:32.566 2016 jpqfx5100-1-sdn mcsnoopd[1352]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000

Aug 18 04:12:18.512 2016 jpqfx5100-2-sdn lacpd[2664]: LACPD_TIMEOUT: xe-0/0/20: lacp current while timer expired current Receive State: CURRENT

Aug 18 04:12:18.516 2016 jpqfx5100-2-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - ATTACHED state - acting as standby link

Aug 18 04:12:18.516 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:18.514 2016 jpqfx5100-2-sdn lacpd[2664]: LACP_INTF_DOWN: ae2: Interface marked down due to lacp timeout on member xe-0/0/20

Aug 18 04:12:18.515 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:18.515 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:18.515 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:18.727 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:18.729 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:19.271 2016 jpqfx5100-2-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - CD state - ready to carry traffic

Aug 18 04:12:19.271 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:19.270 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:19.277 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:19.277 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:19.306 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:19.305 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:26.287 2016 jpqfx5100-2-sdn lacpd[2664]: LACPD_TIMEOUT: xe-0/0/20: lacp current while timer expired current Receive State: CURRENT

Aug 18 04:12:26.291 2016 jpqfx5100-2-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - ATTACHED state - acting as standby link

Aug 18 04:12:26.291 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:26.289 2016 jpqfx5100-2-sdn lacpd[2664]: LACP_INTF_DOWN: ae2: Interface marked down due to lacp timeout on member xe-0/0/20

Aug 18 04:12:26.320 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:26.290 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:26.290 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:26.290 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:26.318 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:29.291 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:29.294 2016 jpqfx5100-2-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - DETACHED state - will not carry traffic

Aug 18 04:12:29.294 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:29.291 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 0

Aug 18 04:12:29.291 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:30.554 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:30.554 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:30.583 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:30.582 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:31.542 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:31.542 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:32.560 2016 jpqfx5100-2-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - ATTACHED state - acting as standby link

Aug 18 04:12:32.560 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:32.560 2016 jpqfx5100-2-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - CD state - ready to carry traffic

Aug 18 04:12:32.560 2016 jpqfx5100-2-sdn /kernel: if_pfe_mcae_color_pfe_update: xe-0/0/20.0: need to send mcae color to pfe

Aug 18 04:12:32.555 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:32.555 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:32.555 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:32.558 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:32.558 2016 jpqfx5100-2-sdn mcsnoopd[1353]: krt_decode_iflogical: xe-0/0/20.0 has got color 2

Aug 18 04:12:32.559 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:32.568 2016 jpqfx5100-2-sdn mcsnoopd[1353]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000

Aug 18 04:12:32.568 2016 jpqfx5100-2-sdn rpd[1327]: Decode ifd xe-0/0/20 index 740: ifdm_flags 0xc000
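The switch log is noisy and not strictly in time order, so it can help to reduce it to the LACP events only. The following is a rough sketch that pulls out the `LACPD_TIMEOUT` and `KERN_LACP_INTF_STATE_CHANGE` lines and sorts them by timestamp. The sample lines are copied from the jpqfx5100-1-sdn log above; the regular expressions and the function name `lacp_events` are my own assumptions about the line layout, not a Junos tool.

```python
import re
from datetime import datetime

# Assumed line layout: "<Mon DD HH:MM:SS.mmm YYYY> <host> <process>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ [\d:.]+ \d{4}) (?P<host>\S+) (?P<proc>[^:]+): (?P<msg>.*)$"
)
# Picks the state name out of "... - CD state - ready to carry traffic".
STATE_RE = re.compile(r"- (\w+) state -")

SAMPLE = [
    "Aug 18 04:12:18.680 2016 jpqfx5100-1-sdn lacpd[1321]: LACPD_TIMEOUT: xe-0/0/20: lacp current while timer expired current Receive State: CURRENT",
    "Aug 18 04:12:19.274 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - CD state - ready to carry traffic",
    "Aug 18 04:12:26.278 2016 jpqfx5100-1-sdn lacpd[1321]: LACPD_TIMEOUT: xe-0/0/20: lacp current while timer expired current Receive State: CURRENT",
    "Aug 18 04:12:29.338 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - DETACHED state - will not carry traffic",
    "Aug 18 04:12:32.559 2016 jpqfx5100-1-sdn /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd xe-0/0/20 - CD state - ready to carry traffic",
    "Aug 18 04:12:32.565 2016 jpqfx5100-1-sdn rpd[1327]: Decode ifd xe-0/0/20 index 741: ifdm_flags 0xc000",
]


def lacp_events(lines):
    """Return time-sorted (timestamp, host, label) tuples for LACP events only."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        when = datetime.strptime(m["ts"], "%b %d %H:%M:%S.%f %Y")
        msg = m["msg"]
        if msg.startswith("LACPD_TIMEOUT"):
            events.append((when, m["host"], "timeout"))
        elif msg.startswith("KERN_LACP_INTF_STATE_CHANGE"):
            state = STATE_RE.search(msg)
            if state:
                events.append((when, m["host"], state.group(1)))
    events.sort(key=lambda e: e[0])
    return events


events = lacp_events(SAMPLE)
for when, host, label in events:
    print(when.strftime("%H:%M:%S.%f")[:-3], host, label)

# Seconds between the first two timeouts in the sample.
timeouts = [when for when, _, label in events if label == "timeout"]
gap = (timeouts[1] - timeouts[0]).total_seconds()
print("gap between timeouts: %.3f s" % gap)
```

On this excerpt the two timeouts on jpqfx5100-1-sdn are 7.598 s apart, and the port only returns to the CD state for good at 04:12:32.559.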

Solaris side

root@srv-da-zfs-02:/etc/driver/drv# egrep "mpath|mac:" /var/adm/messages

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru mac: [ID 486395 kern.info] NOTICE: aggr2 link down

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru mac: [ID 486395 kern.info] NOTICE: vlan612 link down

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 215189 daemon.error] The link has gone down on vlan612

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 773107 daemon.error] All IP interfaces in group sc_ipmp0 are now unusable

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru mac: [ID 486395 kern.info] NOTICE: aggr3 link down

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru mac: [ID 486395 kern.info] NOTICE: vlan616 link down

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 215189 daemon.error] The link has gone down on vlan616

Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 773107 daemon.error] All IP interfaces in group sc_ipmp3 are now unusable

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru mac: [ID 435574 kern.info] NOTICE: aggr3 link up, 10000 Mbps, full duplex

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru mac: [ID 435574 kern.info] NOTICE: vlan616 link up, 10000 Mbps, unknown duplex

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 820239 daemon.error] The link has come up on vlan616

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 561795 daemon.error] At least 1 IP interface (vlan616) in group sc_ipmp3 is now usable

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru mac: [ID 435574 kern.info] NOTICE: aggr2 link up, 10000 Mbps, full duplex

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru mac: [ID 435574 kern.info] NOTICE: vlan612 link up, 10000 Mbps, unknown duplex

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 820239 daemon.error] The link has come up on vlan612

Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 561795 daemon.error] At least 1 IP interface (vlan612) in group sc_ipmp0 is now usable
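To put a number on the outage as seen from the host, the `mac:` link down/up notices can be paired per link. This is only a sketch under a few assumptions: the regular expression reflects the message layout shown above, the year is not in the Solaris syslog lines and is taken from the switch logs, and `link_outages` is a name invented for this example.

```python
import re
from datetime import datetime

ASSUMED_YEAR = 2016  # assumption: not present in these lines, taken from the switch logs

# Matches e.g. "Aug 18 04:12:30 <host> mac: [ID ...] NOTICE: aggr3 link down"
MAC_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ mac: .*NOTICE: "
    r"(?P<link>\S+) link (?P<state>up|down)"
)

SAMPLE = [
    "Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru mac: [ID 486395 kern.info] NOTICE: aggr2 link down",
    "Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru mac: [ID 486395 kern.info] NOTICE: aggr3 link down",
    "Aug 18 04:12:30 srv-da-zfs-02.net.billing.ru in.mpathd[115]: [ID 215189 daemon.error] The link has gone down on vlan616",
    "Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru mac: [ID 435574 kern.info] NOTICE: aggr3 link up, 10000 Mbps, full duplex",
    "Aug 18 04:12:32 srv-da-zfs-02.net.billing.ru mac: [ID 435574 kern.info] NOTICE: aggr2 link up, 10000 Mbps, full duplex",
]


def link_outages(lines):
    """Pair each 'link down' with the next 'link up' of the same link.

    Returns {link_name: [outage_seconds, ...]}.
    """
    down_at = {}
    outages = {}
    for line in lines:
        m = MAC_RE.match(line)
        if not m:
            continue  # skip in.mpathd and other non-mac messages
        when = datetime.strptime(
            "%d %s" % (ASSUMED_YEAR, m["ts"]), "%Y %b %d %H:%M:%S"
        )
        link = m["link"]
        if m["state"] == "down":
            down_at.setdefault(link, when)  # keep the first 'down' of a flap
        elif link in down_at:
            seconds = (when - down_at.pop(link)).total_seconds()
            outages.setdefault(link, []).append(seconds)
    return outages


outages = link_outages(SAMPLE)
for link, durations in sorted(outages.items()):
    print(link, durations)
```

The syslog timestamps have one-second resolution, so the roughly 2 s outage this gives for aggr2 and aggr3 is only approximate.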

1-minute load average when the LACP timeouts appeared. The problem only happens when the system is under high load.

LACP_TIMEOUT01.PNG

Network Traffic

LACP_TIMEOUT02.PNG

Post Details
Locked on Sep 19 2016
Added on Aug 19 2016