Linksys NIC not working with RHEL4 x86_64
An attempt to configure a 2-node RAC on Linux x86_64 fails due to a Linksys NIC problem.
I have the following 2 network adapters in each of 2 machines running RHEL4 (uname -a: 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:13:42 EST 2007 x86_64 x86_64 x86_64 GNU/Linux):
# lspci -v
eth0
05:09.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)
Subsystem: Linksys: Unknown device 0574
Flags: bus master, medium devsel, latency 32, IRQ 201
I/O ports at 1100 [size=256]
Memory at e0a01000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [c0] Power Management version 2
eth1
3f:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5752 Gigabit Ethernet PCI Express (rev 01)
Subsystem: Hewlett-Packard Company: Unknown device 3011
Flags: bus master, fast devsel, latency 0, IRQ 233
Memory at e0500000 (64-bit, non-prefetchable) [size=64K]
Capabilities: [48] Power Management version 2
Capabilities: [50] Vital Product Data
Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/3 Enable+
Capabilities: [d0] Express Endpoint IRQ 0
Capabilities: [100] Advanced Error Reporting
Capabilities: [13c] Virtual Channel
They're connected to a Netgear GS108 8-Port 10/100/1000 Copper Gigabit Ethernet Switch.
[root@plbde-nashua1 devices]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:14:BF:53:B8:B9
inet addr:192.168.1.100 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::214:bfff:fe53:b8b9/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:280028 errors:0 dropped:0 overruns:0 frame:0
TX packets:914 errors:5 dropped:0 overruns:0 carrier:5
collisions:0 txqueuelen:1000
RX bytes:21224432 (20.2 MiB) TX bytes:82508 (80.5 KiB)
Interrupt:201 Base address:0x4000
eth1 Link encap:Ethernet HWaddr 00:13:21:EB:84:D4
inet addr:138.1.17.243 Bcast:138.1.19.255 Mask:255.255.252.0
inet6 addr: fe80::213:21ff:feeb:84d4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:12908 errors:0 dropped:0 overruns:0 frame:0
TX packets:362 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:948009 (925.7 KiB) TX bytes:64346 (62.8 KiB)
Interrupt:185
eth0 is used for the private (local) network and is the Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11), model LNE100TX.
eth1 is used for the public network and is the Broadcom Corporation NetXtreme BCM5752 Gigabit Ethernet PCI Express (rev 01).
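To rule out an addressing mistake, here is a small sanity check of the two subnets. It is only a sketch: the local addresses and netmasks are copied from the ifconfig output above, and the peer addresses are the ones I ping on the other machine later in this post. It says nothing about the driver or the hardware.

```python
import ipaddress

# Addresses and netmasks copied from the ifconfig output above.
private = ipaddress.ip_interface("192.168.1.100/255.255.255.0")  # eth0
public = ipaddress.ip_interface("138.1.17.243/255.255.252.0")    # eth1

# Addresses of the other machine, as used in the ping tests in this post.
peer_private = ipaddress.ip_address("192.168.1.101")
peer_public = ipaddress.ip_address("138.1.17.252")

# Network and broadcast address derived from each address/netmask pair.
print(private.network, private.network.broadcast_address)
print(public.network, public.network.broadcast_address)

# The two networks should not overlap, and each peer should sit on
# the same network as the local interface used to reach it.
print("overlap:", private.network.overlaps(public.network))
print("private peer on-link:", peer_private in private.network)
print("public peer on-link:", peer_public in public.network)
```

If the networks do not overlap and both peers are on-link, the addressing itself is probably not what breaks the pings.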
If I deactivate eth0, or disconnect the eth0 adapters from the switch, I can successfully ping between both machines over the public network (eth1).
However, if I plug eth0 back in, it seems to break everything. This is what happens:
1- I can no longer ping from machine1 to machine2. Ping hangs and strace shows:
sendmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("138.1.17.252")}, msg_iov(1)=[{"\10\0\263\323\2547\0\3\262\236/F\0\0\0\0\3549\v\0\0\0\0"..., 64}], msg_controllen=0, msg_flags=0}, 0) = 64
recvmsg(3, 0x7fbfffe680, 0) = -1 EAGAIN (Resource temporarily unavailable)
2- I can ping from machine2 to machine1, but ping reports corrupted payload data:
64 bytes from plbde-nashua1.us.oracle.com (138.1.17.243): icmp_seq=3 ttl=64 time=0.086 ms
wrong data byte #30 should be 0x1e but was 0x28
#16 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 28 4 40 0 28 4 40 0 28 4 40 0 1e 1f 20 21 1e 1f
#48 20 21 1e 1f 20 21 1e 1f
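To see how much of the payload is damaged, the dump can be checked mechanically. This is a rough sketch and rests on one assumption: that ping's default fill pattern puts the value i in data byte #i, which is at least consistent with the message "byte #30 should be 0x1e". The helper name parse_dump is made up for this example.

```python
# Dump lines copied from the ping output above:
# "#<decimal offset>" followed by the data bytes in hex.
dump = """\
#16 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 28 4 40 0 28 4 40 0 28 4 40 0 1e 1f 20 21 1e 1f
#48 20 21 1e 1f 20 21 1e 1f
"""

def parse_dump(text):
    """Return {offset: value} for every data byte listed in the dump."""
    data = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("#"):
            continue
        start = int(fields[0][1:])  # offset of the first byte on this line
        for i, token in enumerate(fields[1:]):
            data[start + i] = int(token, 16)
    return data

received = parse_dump(dump)

# Assumption: data byte #i is expected to hold the value i (mod 256).
bad = [off for off, val in sorted(received.items()) if val != (off & 0xFF)]

print("bytes listed:", len(received))
print("first bad offset:", bad[0] if bad else None)
print("bad offsets:", bad)
```

Under that assumption, bytes #16 to #29 are intact and everything from #30 onward is wrong, with the sequences 28 4 40 0 and 1e 1f 20 21 repeating. That looks more like data being overwritten or repeated in a buffer than like random noise on the wire, but that is only my guess.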
Then I unplugged the public interfaces (eth1) from the switch to test with the private network only. The machines are unable to ping one another! Ping hangs and strace shows the following:
sendmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.1.101")}, msg_iov(1)=[{"\10\0\354\356\256J\0\3~\0262F\0\0\0\0\353\223\2\0\0\0\0"..., 64}], msg_controllen=0, msg_flags=0}, 0) = 64
recvmsg(3, 0x7fbfffe500, 0) = -1 EAGAIN (Resource temporarily unavailable)
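As I read both traces, sendmsg succeeds (64 bytes handed to the kernel) and recvmsg then returns EAGAIN, which only means that no reply was waiting on the socket when the call returned. The sketch below reproduces that error with an ordinary UDP socket on loopback that nobody sends to. It is an analogy, not what ping does internally: ping uses a raw ICMP socket, and the trace does not show how it keeps the receive call from blocking.

```python
import errno
import socket

# A UDP socket bound to loopback stands in for ping's raw ICMP socket
# (a raw socket would need root). Nothing ever sends to this port.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))
sock.setblocking(False)

try:
    sock.recv(64)      # no datagram is queued, so this cannot succeed
    result = None
except BlockingIOError as exc:
    result = exc.errno  # EAGAIN: "Resource temporarily unavailable"
finally:
    sock.close()

print(errno.errorcode.get(result), result == errno.EAGAIN)
```

So the EAGAIN lines by themselves do not point at a fault in ping. They are just what a missing echo reply looks like from user space.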
Does anyone have any idea why this is happening?
Could this be a bug or an incompatibility between RHEL4 and the Linksys LNE100TX NIC?
Any comments are welcome.