Excessive TCP retransmissions / duplicate acks on X4200
807559 Jul 23 2008, edited Nov 10 2010
Hi,
I've encountered a strange problem on an X4200 (SunOS bank 5.10 Generic_127128-11 i86pc i386 i86pc). When transferring large amounts of data from a remote host to the X4200, snoop shows a huge number of TCP retransmissions, along with duplicate ACKs being sent to the remote host. Normally these would suggest packet loss to me; however, a rolling ping with packets of a similar size to those being received shows no packet loss at all.
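For anyone wanting to put numbers on this, below is a minimal sketch of how the two symptoms could be tallied from a capture. It assumes the snoop output has already been reduced to plain tuples; the function names and tuple layout are hypothetical, not anything snoop produces directly. It ignores sequence number wraparound and cannot tell a true retransmission from a reordered segment, so treat the counts as rough indicators only.

```python
def count_retransmissions(segments):
    """segments: (seq, length) tuples for data segments in one direction,
    in capture order. Counts segments that carry no bytes beyond the
    highest sequence number already seen (heuristic, see caveats above)."""
    highest_end = None
    count = 0
    for seq, length in segments:
        if length == 0:
            continue  # pure ACKs carry no data, so they cannot be retransmitted data
        end = seq + length
        if highest_end is not None and end <= highest_end:
            count += 1  # nothing new: retransmitted or reordered data
        else:
            highest_end = end
    return count


def count_duplicate_acks(acks):
    """acks: (ack_number, payload_length) tuples for the reverse direction,
    in capture order. Counts pure ACKs that repeat the previous ACK number."""
    last = None
    count = 0
    for ack, length in acks:
        if length == 0 and ack == last:
            count += 1
        last = ack
    return count
```

Comparing these counts between the X4200 and the V880 for the same transfer would at least make "huge number" concrete.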
As I also have a SPARC box to hand (a V880, SunOS v880 5.9 Generic_117171-07 sun4u sparc SUNW,Sun-Fire-880), I tried the connection there too. The snoop output there was completely free of bad packets when performing the same process as on the x86 server. Out of interest I also ran the same test on an X4200 running Red Hat Enterprise Linux 4, Update 4, and the snoop (well, tcpdump) output was largely clean; the only retransmissions seem to be the result of the odd dropped packet.
The application binaries in question are as close to identical as possible on each server, and the servers are routed through the same switch and router. I checked the TCP settings that I'd expect to have a bearing on this (tcp_xmit_hiwat, tcp_max_buf, tcp_cwnd_max), and again these are identical on both servers.
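In case it helps anyone repeat the comparison, here is a rough sketch of how the tunables could be collected and diffed. It assumes Python 3.7+ is available on (or the output is copied from) each Solaris host and that the account has sufficient privileges to run `ndd`. The helper names are hypothetical, and `tcp_recv_hiwat` is my own addition to the list since the X4200 is the receiving side here.

```python
import subprocess

# Tunables named in the post, plus tcp_recv_hiwat (an assumption, added
# because the affected host is receiving the data).
TUNABLES = ["tcp_xmit_hiwat", "tcp_recv_hiwat", "tcp_max_buf", "tcp_cwnd_max"]


def read_tunable(name, driver="/dev/tcp"):
    """Return one tunable as a string by calling Solaris ndd."""
    result = subprocess.run(
        ["ndd", "-get", driver, name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def read_all(names=TUNABLES):
    """Collect every listed tunable from the local host into a dict."""
    return {name: read_tunable(name) for name in names}


def diff_tunables(a, b):
    """Return {name: (a_value, b_value)} for each tunable that differs
    between the two hosts or is missing from one of them."""
    names = set(a) | set(b)
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}
```

Running `read_all()` on each server and passing the two dicts to `diff_tunables` should return an empty dict if the settings really do match.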
I'm at a bit of a loss as to what's happening, unless there's a foible of Solaris on the x86 platform that I've overlooked, or a known problem with the X4200 hardware that I haven't been able to find documented.
Thanks for any advice.