LDAP replicas seem to differ but don't replicate

jimklimov, Nov 19 2008 (edited Nov 19 2008)
I've recently noticed a couple of replicated DS6.3 servers that report different entry counts for the same suffix.
Sometimes the counts differ by 1 (e.g. 43 vs 42), sometimes by almost a factor of two (e.g. 115 vs 228).

They have two-way replication enabled and no changes are pending.

I have tried exporting the data to LDIF files to compare the replicas, but the attribute lines appear in a different order within each entry, so a plain diff is useless.
I did discover, however, that the LDIF file has one more entry than the reported count (229 in the last example), which is probably the replication data entry.
The LDIF files from both servers have the same number of lines starting with 'dn: ', but the file sizes differ by a few kilobytes.

Q?: Is this situation serious in itself, or is it perhaps just a stale counter that only affects what admins see in the GUI?
Q?: Is there a simple way to compare the two LDAP replicas and see what the differences are, if any?
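
One way to get around the attribute-order problem when comparing the two exports is to parse each LDIF file, sort the attributes within every entry, and then compare entry by entry, keyed on the DN. The following is only a minimal sketch under stated assumptions, not a tested tool for DS 6.3: the function names (parse_ldif, diff_replicas) and the ignore parameter are made up for illustration, DNs are normalized only by lowercasing, and it handles just the basic LDIF features (folded lines, comments, base64 values).

```python
import base64


def parse_ldif(text, ignore=frozenset()):
    """Return {lowercased DN: sorted list of (attribute, value)} for an LDIF export.

    `ignore` is a hypothetical hook: a set of lowercased attribute names
    that are expected to differ between servers and should be skipped.
    """
    # Unfold continuation lines: a line starting with one space continues the previous line.
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    entries = {}
    dn, attrs = None, []
    for line in lines + [""]:  # the extra blank line flushes the last entry
        if not line.strip():
            if dn is not None:
                entries[dn] = sorted(attrs)
            dn, attrs = None, []
            continue
        if line.startswith("#") or ":" not in line:
            continue
        name, _, rest = line.partition(":")
        if rest.startswith(":"):  # "attr:: value" means the value is base64-encoded
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.strip()
        name = name.lower()
        if name == "dn":
            dn = value.lower()  # simplistic DN normalization (assumption)
        elif name not in ignore:
            attrs.append((name, value))
    return entries


def diff_replicas(a, b):
    """Compare two parsed exports; return (only_in_a, only_in_b, changed)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = {}
    for dn in set(a) & set(b):
        sa, sb = set(a[dn]), set(b[dn])
        if sa != sb:
            # attribute/value pairs present on one side only
            changed[dn] = (sorted(sa - sb), sorted(sb - sa))
    return only_a, only_b, changed
```

To use it, read both export files into strings, pass each to parse_ldif, and hand the two results to diff_replicas. Comparing len() of the two parsed dictionaries also gives the entry counts directly, independent of whatever counter the console reports. If the exports carry per-server metadata, those attribute names can be passed in ignore so they don't show up as differences.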

Now for the worse part, which may be related: I looked at these servers because they behaved "strangely". An ldapsearch for a
certain uid worked on one server and timed out (hung until killed) on the other.

I tried to re-initialize the suffix using the replication agreement and got all sorts of errors, and the ns-slapd process dumped core
at least twice. Apparently some of the database files may have been corrupted for whatever reason, although I haven't seen the
logfiles complain about this.

Ultimately the LDAP server hung and was restarted. After one restart, when I looked at its suffixes, it reported that none of them
were initialized and that all were Consumers. They did not accept initialization via the Replication Agreement, though.

After another restart some suffixes became multimaster replicas with reportedly working agreements. However, the suffixes I had
tried to re-initialize don't respond now: they are uninitialized and refuse to be initialized via the Replication Agreement.

At least one smaller suffix was successfully initialized via replication, though; I can only guess that its database files were in working order.

The typical errors in the logfile are:
[19/Nov/2008:15:19:09 +0300] - DEBUG - conn=-1 op=-1 msgId=-1 -  slapd_poll(129) timed out
[19/Nov/2008:15:42:05 +0300] - ERROR<8283> - Replication  - conn=391 op=2 msgId=3 - Replica already busy Failed to start Replication Session for suffix dc=domain,dc=ru.
Usually, after I get several hundred of these lines in a few minutes (conn= values even grew into the thousands within an hour of ns-slapd
startup), I see statistics like this:
[19/Nov/2008:14:19:48 +0300] - import domain.ru: Processed 0 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit ratio 0%
Once, the import thought it had succeeded... for a short while:
[19/Nov/2008:14:20:14 +0300] - import domain.ru: Import complete.  Processed 0 entries in 269 seconds. (0.00 entries/sec)
[19/Nov/2008:14:20:14 +0300] - ERROR<24577> - Bulk Import - conn=-1 op=-1 msgId=-1 - Internal error  bulk import process failed: state = 7, error code = -1.
Here's the server startup after core-dumping:
[19/Nov/2008:15:04:02 +0300] - Sun-Java(tm)-System-Directory/6.3 B2008.0311.0212 (64-bit) starting up
[19/Nov/2008:15:04:02 +0300] - WARNING<20488> - Backend Database - conn=-1 op=-1 msgId=-1 -  Detected Disorderly Shutdown last time Directory Server was running, recovering database.
...
[19/Nov/2008:15:04:07 +0300] - WARNING<10276> - Incremental Protocol - conn=-1 op=-1 msgId=-1 - Replication inconsistency Consumer Replica "ldap01:636/o=domain.ru,dc=domain,dc=ru" has a different data version. It may have not been initialized yet.
Q?: Any idea what this situation means, except that I have to carefully roll back from backups? :)
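
To put numbers on "several hundred of these lines", the messages in the errors log can be tallied by severity and message code. This is a rough sketch that assumes every relevant line follows the "[timestamp] - LEVEL<code> - ..." shape of the excerpts quoted above; the function name tally is made up for illustration, and lines that don't match (startup banner, import progress) are simply skipped.

```python
import re
from collections import Counter

# Matches e.g. "[19/Nov/2008:15:42:05 +0300] - ERROR<8283> - Replication ..."
# The "<code>" part is optional because DEBUG lines in the excerpts carry no code.
LINE_RE = re.compile(r"^\[[^\]]+\] - (?P<level>[A-Z]+)(?:<(?P<code>\d+)>)? - ")


def tally(lines):
    """Count log lines per (level, code); the code is None when a line has none."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            counts[(m.group("level"), m.group("code"))] += 1
    return counts
```

Feeding it the lines of the errors log (for example via an open file object) and printing counts.most_common() shows which messages dominate, such as how many times ERROR<8283> ("Replica already busy") was logged.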
Locked on Dec 17 2008
Added on Nov 19 2008