Thread: Log-cleaning and threads



Replies: 6 - Pages: 1 - Last Post: May 28, 2009 8:14 AM by user10944705
user10944705

Posts: 5
Registered: 05/20/09
Log-cleaning and threads
Posted: May 26, 2009 10:24 AM
 
I'm trying to understand BDB JE Environment.cleanLog() and threads a little better. I've looked into the product documentation and JavaDoc, and this forum, but haven't been able to find the answers yet.

Here's what I know so far:

By default, log-cleaning is handled by a daemon thread.

It's possible to increase the number of log-cleaning daemon threads by using EnvironmentConfig.setConfigParam(EnvironmentConfig.CLEANER_THREADS, "N").

It's also possible to disable the daemon thread altogether by using EnvironmentConfig.setConfigParam(EnvironmentConfig.ENV_RUN_CLEANER, "false"), and call cleanLog() directly from the application. For greatest cleanliness, there's an algorithm suggested in the JavaDoc for Environment.cleanLog().

These facts leave me with the core question: what's the relationship amongst these things? For example, here are some sub-questions that cross my mind (there may be others I haven't thought of):

(1) Does the daemon thread implement the suggested algorithm? If not, what algorithm does it implement?

(2) If there are multiple daemon threads, what joint algorithm do they implement, and how do they interact? Do they, for example, lock one another out at times?

(3) What are the decision criteria a developer should use concerning using multiple daemon threads? How are multiple threads "better" (or "worse") than a single thread? How does one decide how many threads are good for a given application?

(4) If one is using an app-controlled cleaner thread, what guarantees does cleanLog() give? That is, if the suggested algorithm isn't implemented, what are some alternatives, and why?

(5) Suppose an app-controlled thread implements an infinite loop of the suggested algorithm. What are the implications?

(6) Do multiple app-controlled threads make sense? What happens if multiple app-controlled threads all implement the infinite loop suggested algorithm just mentioned? For example, does one thread lock the others out, such that the result amounts to the same thing as a single thread implementing the suggested algorithm?

(7) Does a combination of daemon threads and app-controlled threads make sense?

Thank you.
greybird

Posts: 1,296
Registered: 07/13/06
Re: Log-cleaning and threads
Posted: May 26, 2009 10:53 AM   in response to: user10944705
 
Hi,

(1) Does the daemon thread implement the suggested algorithm? If not, what algorithm does it implement?

No. The suggested algorithm is simply to force files to be deleted synchronously. The cleaner thread processes files; the checkpointer deletes them at the end of the checkpoint. The cleaner and checkpointer threads run continuously. Normally, you don't need to do anything special.

(2) If there are multiple daemon threads, what joint algorithm do they implement, and how do they interact? Do they, for example, lock one another out at times?

Each thread works on a different log file, so they don't block each other. If that weren't true, there would be no value in having multiple threads.

(3) What are the decision criteria a developer should use concerning using multiple daemon threads? How are multiple threads "better" (or "worse") than a single thread? How does one decide how many threads are good for a given application?

If the cleaner is behind -- backlogged -- you should increase the number of threads and perhaps the cache size. The cleanerBacklog is a property of EnvironmentStats. All apps should monitor the environment stats by calling Environment.getStats periodically.

(4) If one is using an app-controlled cleaner thread, what guarantees does cleanLog() give? That is, if the suggested algorithm isn't implemented, what are some alternatives, and why?

You need a checkpoint after the cleanLog to delete the cleaned files. Checkpoints happen periodically on the configured checkpoint interval, so unless you've disabled this, you don't need to do anything.

(5) Suppose an app-controlled thread implements an infinite loop of the suggested algorithm. What are the implications?

None that I know of.

(6) Do multiple app-controlled threads make sense? What happens if multiple app-controlled threads all implement the infinite loop suggested algorithm just mentioned? For example, does one thread lock the others out, such that the result amounts to the same thing as a single thread implementing the suggested algorithm?

App controlled threads are only needed if you choose to disable the JE daemon threads. In general, you won't need to do this.

(7) Does a combination of daemon threads and app-controlled threads make sense?

No.

Normally, the use of JE cleaner threads is very simple: you should not create app cleaner threads, and should just allow the JE daemon cleaner thread to operate. Please increase the number of JE cleaner threads if the backlog grows, to allow the cleaner to keep up with your application's write activity.
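
As an illustration of watching for a growing backlog, here is a minimal sketch that samples the backlog and flags sustained growth. It does not touch JE itself: BacklogWatcher and its growthLimit parameter are hypothetical names, and the IntSupplier is a stand-in for whatever reads the cleanerBacklog value from the EnvironmentStats in your application.

```java
import java.util.function.IntSupplier;

/** Flags a cleaner backlog that keeps growing across consecutive samples. */
final class BacklogWatcher {
    private final IntSupplier backlogSource; // stand-in for reading cleanerBacklog from EnvironmentStats
    private final int growthLimit;           // consecutive increases required before flagging (hypothetical knob)
    private int lastBacklog = -1;            // -1 means "no sample taken yet"
    private int consecutiveIncreases = 0;

    BacklogWatcher(IntSupplier backlogSource, int growthLimit) {
        this.backlogSource = backlogSource;
        this.growthLimit = growthLimit;
    }

    /** Takes one sample; returns true once the backlog has grown growthLimit times in a row. */
    boolean sampleAndCheck() {
        int current = backlogSource.getAsInt();
        if (lastBacklog >= 0 && current > lastBacklog) {
            consecutiveIncreases++;
        } else {
            consecutiveIncreases = 0; // flat or shrinking backlog resets the streak
        }
        lastBacklog = current;
        return consecutiveIncreases >= growthLimit;
    }
}
```

Calling sampleAndCheck() from a periodic monitoring task, and treating a true result as a hint to raise the cleaner thread count or the cache size, is one possible use; the right sampling period and limit depend on the application.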

Are you asking these questions because the cleaner is not keeping up? The solution to that problem is not to implement cleaning in your app thread.

--mark
user10944705

Posts: 5
Registered: 05/20/09
Re: Log-cleaning and threads
Posted: May 27, 2009 6:15 AM   in response to: greybird
 
Thank you, Mark, for these answers.

The reason I asked the questions is that earlier (last week), I did have some problems with the cleaner keeping up, and I posted a query to another thread. You answered that query adequately: I had my cache setting too low (a consequence of misunderstanding BDB JE's non-standard use of the term "cache"). The problem disappeared when I increased the cache size. But the experience led me to explore the cleaning APIs more closely, and that raised the questions I asked in this thread.

I think you have now supplied me with the answers I was looking for, provided I understand the answers correctly. To verify that, and especially to try to explicate the advice that "normally you don't need to do anything special", I'll try to summarize (please let me know if I have anything wrong):

Assume throughout you're using a cache size that's reasonable for your app (say several megabytes, though further discussion of this point is a topic of its own). In general, the single/default daemon cleaner thread (together with the checkpointer) suffices. The main/only time you might ever want to increase the number of daemon threads is if you observe a backlog. The main/only time you might ever want to turn off the daemon thread is if you want to boost mainline DB perf for some reason. And that is therefore also the main/only time you might ever want to use app-controlled (non-daemon) cleaner threads -- namely, to run cleanLog() outside the sections where you want to boost mainline DB perf. Again, just as for daemon threads, you would generally only use a single app-controlled thread. But if you observe a backlog, you can increase the number of app-controlled threads (because cleanLog() is thread-aware, e.g., it makes the threads work on separate files). You can use the suggested algorithm (the one in the cleanLog() JavaDoc) in your app-controlled threads freely if you wish, though you can also separate the checkpointer out into its own schedule if you wish.
greybird

Posts: 1,296
Registered: 07/13/06
Re: Log-cleaning and threads
Posted: May 27, 2009 11:46 AM   in response to: user10944705
 
Hi Walt,

Your description is very close to being correct. Here's my attempt to list the possible reasons for disabling the JE cleaner threads and for calling Environment.cleanLog. Your persistence and clear descriptions have motivated us to add more documentation -- we'll be working on that, starting with the information below.



The JE cleaner daemon thread(s) are enabled by default. Normally this should not be changed. Possible reasons for disabling the JE cleaner threads are:

1) You may wish to disable the JE cleaner threads during heavy application usage periods and only run the log cleaner when application usage is light (e.g., at 2 am). This can increase throughput during heavy usage periods. However, caution is strongly advised. If the write rate is high during the heavy usage period, filling the disk is a possibility and must be avoided. You must also ensure that there is enough time during light usage periods for the log cleaner to catch up with the backlog created during the heavy usage periods. In addition, random reads may be negatively impacted during the heavy usage periods if the JE log grows very large, because there may be fewer hits in the file system cache.

2) You may wish to disable the JE cleaner and checkpointer threads when performing a "bulk load". A bulk load is a large set of writes, usually inserts but sometimes also updates and deletions, that is performed in a batch mode while all other application functions are disabled. It is used to initialize a large data set. The objective is to complete the load as quickly as possible and to use as little disk space as possible. Note that deferred write mode (see DatabaseConfig.setDeferredWrite) is often used for a bulk load to minimize writing.

Checkpointing can be disabled to avoid wasting disk space with multiple, redundant checkpoints during the load. Instead a single checkpoint is performed after the load is complete. This is acceptable because recovery time does not need to be bounded by checkpoints -- if a crash occurs during the load, the load can be restarted from scratch. Log cleaning can also be disabled to speed up the load. If only insertions are performed, then log cleaning will not be needed anyway. But even if updates and deletions are performed, log cleaning is not productive while the checkpointer is disabled since log files will not be deleted. Log cleaning may be performed efficiently by calling cleanLog at the end of the load, followed by a checkpoint.

3) You may wish to implement your own log cleaning threads for administrative reasons. Perhaps you have a special thread pool you wish to use, or you're sharing a thread pool with other components. In this case, your threads take on the same role as the JE daemon threads. Your threads should call Environment.cleanLog periodically. The number of threads calling cleanLog should be increased when the EnvironmentStats cleanerBacklog value grows. A checkpoint is not normally necessary, since checkpoints should occur independently on their own schedule. But if you also disable the JE checkpointer thread, then you should call Environment.checkpoint periodically from your own thread.

4) Using a NAS (e.g., NFS) for JE storage can be problematic for several reasons. For one, the EnvironmentConfig.LOG_USE_ODSYNC parameter must be set. In addition, if the NAS does not support the file locking needed by JE, then running multiple processes is problematic. JE cannot use file locking, and therefore cannot coordinate multiple processes accessing the same environment. It is then up to the application to ensure that only one process is writing to the environment, and that log cleaning is disabled when any read-only processes are open. The log cleaner threads may need to be disabled by the application in such situations.
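
To make the bulk-load case (2) concrete, here is a rough sketch of the two property sets involved. It only builds java.util.Properties objects and does not depend on JE. The keys "je.env.runCleaner" and "je.env.runCheckpointer" are assumed to be the string forms of EnvironmentConfig.ENV_RUN_CLEANER and EnvironmentConfig.ENV_RUN_CHECKPOINTER, so please verify them against the documentation for your JE version. BulkLoadSettings is a hypothetical helper name.

```java
import java.util.Properties;

/** Builds JE-style property sets for a bulk load and for normal operation afterwards. */
final class BulkLoadSettings {
    /** Daemon cleaner and checkpointer switched off for the duration of the load. */
    static Properties duringLoad() {
        Properties p = new Properties();
        p.setProperty("je.env.runCleaner", "false");      // assumed key for ENV_RUN_CLEANER
        p.setProperty("je.env.runCheckpointer", "false"); // assumed key for ENV_RUN_CHECKPOINTER
        return p;
    }

    /** Both daemons enabled again once the load, the final cleanLog and the checkpoint are done. */
    static Properties afterLoad() {
        Properties p = new Properties();
        p.setProperty("je.env.runCleaner", "true");
        p.setProperty("je.env.runCheckpointer", "true");
        return p;
    }
}
```

The same settings can equally be applied through EnvironmentConfig.setConfigParam with the constants named in this thread; the Properties form is used here only to keep the sketch self-contained.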



Below are some example use cases where calling Environment.cleanLog is needed.

A) If you implement your own log cleaning threads (3) then you should call cleanLog periodically. The JE daemon threads effectively call cleanLog after each N bytes of log is written, where by default N is 0.25 times the maximum size of a log file, and may be configured using EnvironmentConfig.CLEANER_BYTES_INTERVAL. For simplicity your log cleaning threads may call cleanLog based on a configured time interval. As mentioned above (3), a checkpoint is not normally necessary after calling cleanLog.

B) The JE cleaner threads are triggered by write activity. You may wish to call cleanLog in order to force cleaning to occur when no other write activity is occurring. For example, you may wish to do this at the end of a bulk load (2), or as a utility function. After calling cleanLog, a checkpoint should be performed to cause cleaned log files to be deleted.
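
For case (A), a time-based application cleaner could look roughly like the sketch below. It is an illustration under assumptions, not JE code: AppCleaner and the LogCleaner interface are hypothetical, with LogCleaner standing in for the single Environment.cleanLog call. The stand-in declares no checked exceptions, so adapt the error handling to your JE version.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Runs cleaning passes on an application-owned thread pool at a fixed delay. */
final class AppCleaner {
    /** Hypothetical stand-in for the one Environment method this sketch needs. */
    interface LogCleaner {
        int cleanLog(); // number of log files cleaned in this pass
    }

    /** One cleaning pass; failures are reported rather than thrown, so the schedule keeps running. */
    static Runnable cleaningTask(LogCleaner env) {
        return () -> {
            try {
                env.cleanLog();
            } catch (RuntimeException e) {
                System.err.println("cleanLog failed: " + e);
            }
        };
    }

    /** Starts `threads` independent cleaning schedules, each pausing periodSeconds between passes. */
    static ScheduledExecutorService start(LogCleaner env, int threads, long periodSeconds) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.scheduleWithFixedDelay(cleaningTask(env), 0, periodSeconds, TimeUnit.SECONDS);
        }
        return pool;
    }
}
```

The exception is caught inside the task because a scheduled task that throws is not run again by ScheduledExecutorService. The caller is expected to shut the returned pool down before closing the environment.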



For completeness, I'd like to say a little more about checkpoints and log cleaning. As mentioned above, a checkpoint is necessary after the log cleaner has "processed" a log file, and before the file can be deleted. The log cleaner (the JE cleaner threads and the cleanLog method) processes a log file by migrating all active data from that file to the end of the log. The checkpoint is necessary before deleting the file, to ensure that no references to that log file remain active.
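
The clean-then-checkpoint sequence can be sketched as follows. This is a self-contained illustration, not JE code: DrainCleaner and the CleanableEnv interface are hypothetical, with cleanLog standing in for Environment.cleanLog and forceCheckpoint standing in for a call to Environment.checkpoint with a forcing CheckpointConfig.

```java
/** Cleans until nothing is left to clean, then checkpoints once so cleaned files can be deleted. */
final class DrainCleaner {
    /** Hypothetical stand-in for the two Environment calls used here. */
    interface CleanableEnv {
        int cleanLog();         // number of log files cleaned in this pass
        void forceCheckpoint(); // stand-in for a forced Environment.checkpoint
    }

    /** Returns the total number of files cleaned; checkpoints only if that total is non-zero. */
    static int cleanThenCheckpoint(CleanableEnv env) {
        int total = 0;
        int cleaned;
        while ((cleaned = env.cleanLog()) > 0) {
            total += cleaned;
        }
        if (total > 0) {
            env.forceCheckpoint(); // a single checkpoint covers every file cleaned above
        }
        return total;
    }
}
```

Doing the checkpoint once, after the loop, avoids paying for a checkpoint on every cleaning pass.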

In addition, the checkpoint does a lot of the work -- the heavy lifting -- of log cleaning. When a log file is processed, the active data is placed in memory. But it is left to the checkpoint to write the active data to the end of the log. This has several advantages:

1) It offloads some of the work from the log cleaner threads, so they can make better progress and keep up with the application threads.

2) It reduces the total amount of writing by deferring it for as long as possible. Multiple updates to the active data are consolidated when writing is deferred until the next checkpoint.

3) Data is clustered naturally when writing is deferred. Data is written by the checkpointer in groups of records, where the records in a group have key values in close proximity to each other. For applications having locality of reference by key value, but where the records are initially written in a different order in the log, read performance may be improved.

In some applications, however, this approach can cause very long checkpoints, with negative repercussions. In particular, this can occur when the JE cache is very large (e.g., multiple GB) and the write rate is high. Because of the large cache, write activity and related log cleaner activity can queue up a large amount of work that must be done during each checkpoint. If the checkpoint takes too long (if it spans many log files) then the recovery interval may be very long also, and recovery after a crash may take a very long time. Long checkpoints also prevent cleaned log files from being deleted promptly.

For such applications, the EnvironmentConfig.CHECKPOINTER_HIGH_PRIORITY configuration parameter should be set to true. This causes two changes in behavior:

a) The log cleaner threads (and the cleanLog method) will write active data to the end of the log, rather than leaving this work to be done by the checkpointer.

b) The checkpointer will log multiple Btree nodes at a time, reducing contention with other threads.

Both of these changes cause the checkpoint to complete in much less time. This can have a significant positive impact on overall performance. If your application has long checkpoints (as usual, watch the EnvironmentStats), you should consider this option.

If you use this option, it is very likely that you'll also need to increase the number of log cleaner threads. The checkpointer will be doing less work, but the log cleaner thread(s) will be doing more work. Therefore more log cleaner threads will probably be needed to prevent the cleaner backlog from growing.
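
As a rough sketch of that combination, the helper below builds a property set that enables the high-priority checkpointer and sets the cleaner thread count together. It uses only java.util.Properties. The keys "je.checkpointer.highPriority" and "je.cleaner.threads" are assumed to be the string forms of EnvironmentConfig.CHECKPOINTER_HIGH_PRIORITY and EnvironmentConfig.CLEANER_THREADS, so check them against your JE version. HighPriorityCheckpointSettings is a hypothetical name, and the thread count is something to tune by watching the cleaner backlog, not a recommendation.

```java
import java.util.Properties;

/** Builds JE-style properties for workloads that suffer from long checkpoints. */
final class HighPriorityCheckpointSettings {
    /** cleanerThreads is a tuning value chosen by the caller; it must be at least 1. */
    static Properties build(int cleanerThreads) {
        if (cleanerThreads < 1) {
            throw new IllegalArgumentException("cleanerThreads must be >= 1");
        }
        Properties p = new Properties();
        p.setProperty("je.checkpointer.highPriority", "true");                 // assumed key
        p.setProperty("je.cleaner.threads", Integer.toString(cleanerThreads)); // assumed key
        return p;
    }
}
```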

--mark
Gojomo

Posts: 30
Registered: 07/24/06
Re: Log-cleaning and threads
Posted: May 27, 2009 5:23 PM   in response to: greybird
 
I think I followed the above until the last paragraph about increasing the number of log cleaner threads. How do more log cleaner threads help things catch up -- won't they just get in each others' way with various kinds of contention? I would normally think a single privileged cleaner thread -- taking locks if it needs to -- would be the most efficient way to catch up on a backlog... but perhaps I'm missing something?

- Gordon @ IA
greybird

Posts: 1,296
Registered: 07/13/06
Re: Log-cleaning and threads
Posted: May 27, 2009 5:32 PM   in response to: Gojomo
 
Hi Gordon,

With multiple cleaner threads, each thread works on a separate log file. Of course, there is still contention on Btree access. But in fact the Btree contention is the main reason for using multiple cleaner threads, in order to balance multiple app threads. If there are 10 app threads and lots of "waste" being generated, but only a single cleaner thread, the cleaner thread won't get enough slices to keep up. If there is only a single app thread, then a single cleaner thread should be sufficient.

--mark
user10944705

Posts: 5
Registered: 05/20/09
Re: Log-cleaning and threads
Posted: May 28, 2009 8:14 AM   in response to: greybird
 
Thanks Mark, this is exactly what I wanted to know.

I was especially interested to see the additional gloss on CHECKPOINTER_HIGH_PRIORITY. Before I saw that, I was thinking: why not change the suggested algorithm from (pidgin code):

anyCleaned = false;
while (cleanLog() > 0)
    anyCleaned = true;
if (anyCleaned)
    checkpoint();

to:

do {
    cleaned = cleanLog();
    checkpoint();
} while (cleaned > 0);

Anyway, I think I can now safely mark this thread, "Yes, my question has been answered." :-)