Hi Experts,
Our setup:
Database: Oracle 11.2.0.4
GoldenGate: 12.2.0.1.160823
Linux: 2.6.32.43
JRE: 1.8.0_121
One Extract process, classic capture mode
12 Replicat processes, through the OGG Big Data adapter, to a Flume exit
No remote trail is used. The Extract and adapter processes are located on the same box.
Problem:
After deployment, the setup runs smoothly for a few days, and then lag appears at random. It is reported both by "info all" on the OGG console and by our client program.
The lag seems quite random, related to neither time of day nor workload. Sometimes it catches up within a few hours; sometimes it gets worse, and the reported lag can reach 12 hours or more.
The "info all" output for the capture and adapter processes looks like this:
Program     Status      Group       Lag at Chkpt  Time Since Chkpt
MANAGER     RUNNING
EXTRACT     RUNNING     EXT         00:00:00      00:00:01

Program     Status      Group       Lag at Chkpt  Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     REP01       13:57:51      00:00:07
REPLICAT    RUNNING     REP02       13:35:07      00:00:03
REPLICAT    RUNNING     REP03       13:45:37      00:00:00
REPLICAT    RUNNING     REP04       15:00:40      00:00:04
REPLICAT    RUNNING     REP05       13:15:52      00:00:09
REPLICAT    RUNNING     REP06       14:09:47      00:00:09
REPLICAT    RUNNING     REP07       14:17:53      00:00:07
REPLICAT    RUNNING     REP08       14:41:56      00:00:09
REPLICAT    RUNNING     REP09       15:18:12      00:00:06
REPLICAT    RUNNING     REP10       14:40:01      00:00:03
REPLICAT    RUNNING     REP11       14:30:23      00:00:05
REPLICAT    RUNNING     REP12       13:30:47      00:00:09
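For anyone who wants to compare the numbers over time, here is a minimal sketch (Python, not our actual client program) of how status lines like the ones above could be turned into lag in seconds. The function names, the sample text, and the 14-hour threshold are illustrative assumptions only.

```python
import re

# Matches status lines such as "REPLICAT RUNNING REP01 13:57:51 00:00:07".
# Header lines and MANAGER lines (no group, no lag) do not match.
LINE_RE = re.compile(
    r"^(EXTRACT|REPLICAT)\s+(\S+)\s+(\S+)\s+"
    r"(\d+):(\d{2}):(\d{2})\s+(\d+):(\d{2}):(\d{2})$"
)


def to_seconds(hours, minutes, seconds):
    """Convert HH:MM:SS string fields to a number of seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def parse_info_all(text):
    """Return {group: (lag_seconds, seconds_since_checkpoint)}."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # header, MANAGER, or blank line
        group = match.group(3)
        lag = to_seconds(*match.group(4, 5, 6))
        since_chkpt = to_seconds(*match.group(7, 8, 9))
        stats[group] = (lag, since_chkpt)
    return stats


def lagging_groups(stats, threshold_seconds):
    """Return groups whose lag exceeds the threshold, worst first."""
    over = [g for g, (lag, _) in stats.items() if lag > threshold_seconds]
    return sorted(over, key=lambda g: stats[g][0], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample: a few lines copied from the output above.
    sample = """\
MANAGER RUNNING
EXTRACT RUNNING EXT 00:00:00 00:00:01
REPLICAT RUNNING REP01 13:57:51 00:00:07
REPLICAT RUNNING REP09 15:18:12 00:00:06
"""
    parsed = parse_info_all(sample)
    for name in lagging_groups(parsed, 14 * 3600):
        print(name, parsed[name][0], "seconds behind")
```

Note that in our case "Time Since Chkpt" stays small while "Lag at Chkpt" is large, so the replicats are still checkpointing regularly; they are just far behind.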
The server running the database and OGG is quite powerful and mostly idle (both CPU and I/O utilization are below 5%), and we can't find any bottleneck on it.
We have several similar setups deployed. This is the only one where we hit the problem; the others are running just fine. We also tried migrating this setup to another server, but no luck. That at least rules out an environment problem.
We looked into ggserr.log and the dirrpt directory; no errors or warnings were found.
We also tried the troubleshooting methods described on My Oracle Support, and different parameters such as GROUPTRANSOPS, MAXTRANSOPS, and BATCHSQL, with no luck.
This has haunted us for months. Can anyone give us a clue?
Thanks for any reply.
- Todd