When the __db file reaches its current size limit, libdb extends it with lseek() and write() (through __os_seek() and __os_write()):
int
__db_file_extend(env, fhp, size)
<SNIP>
    pages = (db_pgno_t)((size - sizeof(buf)) / MEGABYTE);
    relative = (u_int32_t)((size - sizeof(buf)) % MEGABYTE);
    if ((ret = __os_seek(env, fhp, pages, MEGABYTE, relative)) == 0)
        ret = __os_write(env, fhp, &buf, sizeof(buf), &nw);
However, after the __db file is extended, the existing memory mapping is not resynchronized with the file.
The mmap documentation has the following notes.
From the mmap(2) man page:
A file is mapped in multiples of the page size. For a file that is not
a multiple of the page size, the remaining memory is zeroed when
mapped, and writes to that region are not written out to the file. The
effect of changing the size of the underlying file of a mapping on the
pages that correspond to added or removed regions of the file is
unspecified.
From mmap in IEEE Std 1003.1-2017:
If the size of the mapped file changes after the call to mmap() as a result of
some other operation on the mapped file, the effect of references to portions
of the mapped region that correspond to added or removed portions of the file
is unspecified.
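To make the page-size note concrete, here is a small sketch of the rounding involved. The helper names are hypothetical, and the code assumes that sysconf(_SC_PAGESIZE) succeeds and that the file is mapped from offset 0.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Round len up to a multiple of the system page size.
 * Assumes sysconf(_SC_PAGESIZE) returns a positive value.
 */
static size_t
page_round_up(size_t len)
{
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);

    return ((len + pagesize - 1) / pagesize * pagesize);
}

/*
 * Number of bytes in the last mapped page that lie past end-of-file
 * when a file of file_size bytes is mapped from offset 0. Per the
 * man page text above, these bytes are zeroed at map time and writes
 * to them are not written out to the file.
 */
static size_t
tail_bytes_past_eof(size_t file_size)
{
    return (page_round_up(file_size) - file_size);
}
```

Anything beyond that last page, for example a region that only exists because the file was extended later, falls under the "unspecified" wording in both quotes.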
Therefore, one of the following measures is required.
- Call munmap()/mmap() to remap the region after the file has been extended with lseek() and write().
- Allocate the __db file at its maximum size in advance instead of expanding it gradually.
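The two options can be sketched with plain POSIX calls as follows. This is a minimal illustration, not libdb code: both function names are hypothetical, the mapping is assumed to start at offset 0 with MAP_SHARED, and ftruncate() is used for the up-front sizing only to keep the sketch short (it may leave the file sparse).

```c
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Option 1 (hypothetical helper): after the file behind fd has grown,
 * drop the old mapping and map the file again at its new length.
 * Returns the new address, or MAP_FAILED on error. The new address
 * may differ from old_addr, so pointers into the old mapping become
 * invalid.
 */
static void *
remap_after_extend(int fd, void *old_addr, size_t old_len, size_t new_len)
{
    if (munmap(old_addr, old_len) != 0)
        return (MAP_FAILED);
    return (mmap(NULL, new_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
}

/*
 * Option 2 (hypothetical helper): size the file to max_len before the
 * first mmap(), so the file never changes size under a live mapping.
 * Returns the mapped address, or MAP_FAILED on error.
 */
static void *
map_at_max_size(int fd, size_t max_len)
{
    if (ftruncate(fd, (off_t)max_len) != 0)
        return (MAP_FAILED);
    return (mmap(NULL, max_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
}
```

With option 1, every process that has the file mapped would have to remap it, and any stored pointers would need to be recomputed. Option 2 avoids that at the cost of reserving the maximum size from the start.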
The following patch implements the latter approach:
--------
diff --git a/db-18.1.40/src/os/os_map.c b/db-18.1.40/src/os/os_map.c
index dcf2c23..83a79a8 100644
--- a/db-18.1.40/src/os/os_map.c
+++ b/db-18.1.40/src/os/os_map.c
@@ -231,15 +231,7 @@ __os_attach(env, infop, rp)
if (rp->max < rp->size)
rp->max = rp->size;
if (ret == 0 && F_ISSET(infop, REGION_CREATE)) {
-#ifdef HAVE_MLOCK
- /*
- * When locking the region in memory extend it fully so that it
- * can all be mlock()'d now, and not later when paging could
- * interfere with the application. [#21379]
- */
- if (F_ISSET(env, ENV_LOCKDOWN))
- rp->size = rp->max;
-#endif
-
+ rp->size = rp->max;
if (F_ISSET(dbenv, DB_ENV_REGION_INIT))
ret = __db_file_write(env, infop->fhp,
rp->size / MEGABYTE, rp->size % MEGABYTE, 0x00);
--------
The patch is also available as a commit:
https://github.com/miztake/db-18.1.40/commit/45446f3857e6a972a3477ac01b8fb578fce03056
Do you have any opinions about this patch?