Error reading/writing SMTP packet
807574, Jan 17 2008 (edited Feb 11 2020)
Sun Java System Messaging Server 6.2-3.04
For a few days, messages to some remote domains have been stuck in the queue with one of two errors: "Error writing SMTP packet" or "Error reading SMTP packet; response to dot-stuffed message expected; likely problem with network or remote SMTP server." Most of our mail is delivered normally, but some mail to some domains is re-queued over and over. For instance, one message to domain.edu gets through fine; another is queued for days; a third was queued with one of the errors for an hour or two and was then delivered. I can't find a pattern in the affected messages (size, attachments, senders, recipients).
master_debug on TCP_local shows (with a bit of snipping):
12:04:37.71: Sending : "MAIL FROM:<aaa@oursite.com> SIZE=4382720
12:04:37.84: Got status : "250 OK <aaa@oursite.com> Sender ok"
12:04:37.84: Sending : "RCPT TO:<bbb@theirsite.com>
12:04:37.88: Got status : "250 OK <bbb@theirsite.com> Recipient ok"
12:04:37.88: Sending : "DATA"
12:04:37.92: Got status : "354 Start mail input; end with <CRLF>.<CRLF>"
12:04:37.92: Write message header/body in one go
12:06:49.18: smtp_pmt_write: [0x00000024] network write failed
12:06:49.22: smtp_pmt_close: [0x00000024] status 0
********************************
Second type of log:
12:00:24.13: Sending : "MAIL FROM:<aaa@oursite.com> SIZE=25600
12:00:25.13: Got status : "250 2.1.0 <aaa@oursite.com>... Sender ok"
12:00:25.13: Sending : "RCPT TO:<bbb@theirsite.com>
12:00:26.52: Got status : "250 2.1.5 bbb@theirsite.com... Recipient ok"
12:00:26.52: Sending : "DATA"
12:00:26.79: Got status : "354 Start mail input; end with <CRLF>.<CRLF>"
12:00:26.79: Write message header/body in one go
12:00:26.80: ... Message header/body, 370 lines ...
12:00:26.80: Sending : "."
12:03:27.16: smtp_pmt_read: [0x00000024] network read failed
12:03:27.17: smtp_pmt_close: [0x00000024] status 0
From what I've read here and elsewhere, these failures indicate a network issue: we aren't receiving the expected ack, we time out, and re-queue.
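To put numbers on those timeouts, here is a minimal Python sketch that measures how long the channel sat idle before each failure. It is not part of Messaging Server; the function names are hypothetical, and it assumes log lines in the `HH:MM:SS.cc: message` shape shown in the excerpts, with no rollover past midnight inside one excerpt.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the master_debug excerpts: "HH:MM:SS.cc: message"
LINE_RE = re.compile(r"^(\d{1,2}:\d{2}:\d{2}\.\d{1,6}): (.*)$")

# The two failure markers that appear in the excerpts.
FAILURE_MARKERS = ("network write failed", "network read failed")


def parse_events(lines):
    """Return (timestamp, message) pairs for every line that matches the assumed shape."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            events.append((stamp, match.group(2)))
    return events


def stall_before_failure(lines):
    """Return (seconds_idle, last_message_before_failure), or None if no failure is logged."""
    events = parse_events(lines)
    for index, (stamp, message) in enumerate(events):
        if any(marker in message for marker in FAILURE_MARKERS):
            if index == 0:
                return None  # nothing logged before the failure, so no gap to measure
            previous_stamp, previous_message = events[index - 1]
            return (stamp - previous_stamp).total_seconds(), previous_message
    return None


if __name__ == "__main__":
    write_excerpt = [
        '12:04:37.92: Write message header/body in one go',
        '12:06:49.18: smtp_pmt_write: [0x00000024] network write failed',
    ]
    read_excerpt = [
        '12:00:26.80: Sending : "."',
        '12:03:27.16: smtp_pmt_read: [0x00000024] network read failed',
    ]
    for excerpt in (write_excerpt, read_excerpt):
        print(stall_before_failure(excerpt))
```

On the two excerpts above, this gives a stall of about 131 seconds before the write failure and about 180 seconds after the final "." before the read failure. In both cases the session goes quiet only once the message body is on the wire.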
Since I'm seeing this happening with both of our outbound servers, and multiple receiving domains, is it reasonable to start looking at our network and/or firewall? What else should I look at? Nothing is being dropped at the firewall, per our firewall guy.
Thanks for any help.