Debugging Freezes of Apache 2.4 with TCP Sessions in CLOSE_WAIT State

Steffen Moser, Jul 21 2018 (edited Aug 14 2018)

Hi all,

On our workgroup server, we run Nextcloud 13.0.4, PHP 7.1.17 and Apache 2.4.33 in a non-global zone of our Solaris 11.3 SRU 34 host. Apache uses the (default) MPM "event", and PHP is accessed via PHP-FPM. From time to time (irregularly, roughly once every two weeks), Apache freezes: its port 443 is still open, but any newly opened connection just hangs. When this happens, we typically see more than a hundred TCP connections between a single client and our server in "CLOSE_WAIT" state, and they stay in that state indefinitely.
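For reference, as far as I understand TCP, a socket is in CLOSE_WAIT when the remote side has already closed its end (sent a FIN) but the local application has not yet called close() on its own socket. The following is only a minimal Python sketch of that situation on the loopback interface; the function name `demo_close_wait` is made up for illustration and has nothing to do with Apache's internals.

```python
import socket


def demo_close_wait():
    """Create a local TCP connection whose server side ends up in CLOSE_WAIT."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)

    # The client closes first, so its FIN reaches the server side.
    client.close()

    # recv() returning b"" is how the application learns the peer is gone.
    data = server_side.recv(1024)

    # Until close() is called here, the kernel keeps this socket in
    # CLOSE_WAIT. A server that never reaches this close() leaks one
    # such connection per request.
    still_open = server_side.fileno() != -1

    server_side.close()
    listener.close()
    return data, still_open
```

If that reading is right, the stuck connections would point at the server side not closing sockets that the client has already given up on, rather than at a network problem.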

We don't see anything of particular interest in Apache's or Nextcloud's logs. The connections stuck in CLOSE_WAIT state were caused by simple accesses to Nextcloud resources (e.g. "PROPFIND /nextcloud/remote.php/dav/files/... HTTP/1.1"), in this case most probably by the Nextcloud client software (which uses WebDAV to synchronize with the server). Restarting the PHP-FPM daemon doesn't solve the problem: the connections between client and Apache still stay in CLOSE_WAIT state. Only restarting Apache solves it.

Interestingly, sending Apache's processes a SIGKILL doesn't remove the CLOSE_WAIT TCP connections immediately: "netstat -aun" keeps showing them for a few more minutes (still referencing Apache's PID, which no longer exists).
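To track how the CLOSE_WAIT count develops over time, the netstat output can be tallied per TCP state. This is a rough Python sketch, not a finished tool: the helper names `count_tcp_states` and `netstat_snapshot` are made up, and the parser assumes only that each connection line contains its state as a separate whitespace-delimited word, since the exact column layout depends on the netstat options used.

```python
import subprocess
from collections import Counter

# TCP state names as they may appear in netstat output.
TCP_STATES = {
    "ESTABLISHED", "CLOSE_WAIT", "TIME_WAIT", "FIN_WAIT_1", "FIN_WAIT_2",
    "LAST_ACK", "CLOSING", "SYN_SENT", "SYN_RCVD", "LISTEN", "IDLE", "BOUND",
}


def count_tcp_states(netstat_output):
    """Count connection lines per TCP state in netstat-style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        for token in line.split():
            if token in TCP_STATES:
                counts[token] += 1
                break  # at most one state per line
    return counts


def netstat_snapshot():
    """Run the same netstat call as above and tally the states."""
    result = subprocess.run(
        ["netstat", "-aun"], capture_output=True, text=True, check=True
    )
    return count_tcp_states(result.stdout)
```

Calling `netstat_snapshot()["CLOSE_WAIT"]` from a cron job or a small loop and logging the number with a timestamp should show whether the connections pile up gradually or all at once.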

I have enabled a "mod_status" page in Apache and save it every few seconds, so that I can debug this further the next time it occurs. I think the freeze of Apache is caused by reaching a certain limit on the number of connections. But why do the CLOSE_WAITs occur in the first place? The problem didn't start with SRU 34. We have been experiencing it for a few months, but (most probably due to increasing load) we have been seeing it a bit more often recently. We also experience a similar problem in another non-global zone on the same machine, which is primarily used for serving the learning management system Moodle 3.1.12+ on PHP 5.6.36 with the same Apache version.
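The periodic mod_status capture could look roughly like the Python sketch below. It relies on the machine-readable "?auto" variant of the status page; the URL `http://localhost/server-status?auto`, the function names and the snapshot directory are assumptions for illustration and need to be adapted to the actual configuration.

```python
import time
import urllib.request
from collections import Counter
from pathlib import Path

# Assumed location of the status page; adjust to the real <Location> block.
STATUS_URL = "http://localhost/server-status?auto"


def parse_status_auto(text):
    """Turn 'Key: value' lines of the ?auto output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        if ": " in line:
            key, value = line.split(": ", 1)
            fields[key.strip()] = value.strip()
    return fields


def scoreboard_summary(fields):
    """Count worker slots per scoreboard character (e.g. 'W', 'K', 'C', '_')."""
    return Counter(fields.get("Scoreboard", ""))


def save_snapshot(directory, url=STATUS_URL):
    """Fetch the status page once and store it under a timestamped name."""
    with urllib.request.urlopen(url, timeout=10) as response:
        text = response.read().decode("utf-8", errors="replace")
    path = Path(directory) / time.strftime("status-%Y%m%d-%H%M%S.txt")
    path.write_text(text)
    return parse_status_auto(text)
```

Comparing the `scoreboard_summary` of the last snapshots before a freeze should show whether all worker slots end up in one state, for example "C" (closing connection) or "W" (sending reply).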

Does anybody have an idea how to debug it further? Does anybody else experience it? Our Apache configuration is quite close to Solaris'/Apache's default.

Thank you very much in advance for any help!

Kind regards,

Steffen

Post Details
Locked on Sep 11 2018
Added on Jul 21 2018