Coherence version: 12.2.1.4.0
We are using ActiveActive federation across two clusters (one in London and one in New York). Today we had a prolonged network glitch that slowed down communication between our London and New York servers.
What we observed is that, because of the cross-Atlantic network delay, local writes to a cluster were getting backed up as well. Can you please check whether this is a bug or expected behaviour? If it is the latter, is there any way we can configure Federation so that the replication flow does not back up local writes?
We were getting logs like the following during the network glitch:
2020-01-05T23:41:08,356 WARN [Logger@9237753 12.2.1.4.0][Coherence] (thread=SelectionService(channels=7, selector=MultiplexedSelector(sun.nio.ch.EPollSelectorImpl@7e905460), id=150693841), member=4) tmb://10.53.200.125:9300.45391 accepted connection migration with tmb://10.53.200.126:9300.54452 on MultiplexedSocketChannel(MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52414]}): peer=tmb://10.53.200.126:9300.54452, state=ACTIVE, socket=MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52414]}, migrations=6, bytes(in=13683513, out=23235313), flushlock false, bufferedOut=6.95KB, unflushed=0B, delivered(in=54117, out=58173), timeout(ack=7.49s), interestOps=1, unflushed receipt=0, receiptReturn 0, isReceiptFlushRequired false, bufferedIn(), msgs(in=28914, out=29393/29412)
java.io.IOException: ack timeout after 15s
at com.oracle.common.internal.net.socketbus.BufferedSocketBus$BufferedConnection.checkHealth(BufferedSocketBus.java:890)
at com.oracle.common.internal.net.socketbus.AbstractSocketBus$5.lambda$run$0(AbstractSocketBus.java:644)
at com.oracle.common.internal.net.socketbus.AbstractSocketBus$5$$Lambda$208/626754434.accept(Unknown Source)
at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
at com.oracle.common.internal.net.socketbus.AbstractSocketBus$5.run(AbstractSocketBus.java:644)
at com.oracle.common.internal.net.socketbus.AbstractSocketBus$3.run(AbstractSocketBus.java:426)
at com.oracle.common.internal.net.RunnableSelectionService.processRunnables(RunnableSelectionService.java:533)
at com.oracle.common.internal.net.RunnableSelectionService.process(RunnableSelectionService.java:349)
at com.oracle.common.internal.net.RunnableSelectionService.run(RunnableSelectionService.java:274)
at com.oracle.common.internal.net.ResumableSelectionService.run(ResumableSelectionService.java:133)
at java.lang.Thread.run(Thread.java:745)
2020-01-05T23:41:33,974 WARN [Logger@9237753 12.2.1.4.0][Coherence] (thread=SelectionService(channels=15, selector=MultiplexedSelector(sun.nio.ch.EPollSelectorImpl@2b773ba8), id=505818397), member=4) tmb://10.53.200.125:9300.45391 accepted connection migration with tmb://10.53.200.126:9300.54452 on MultiplexedSocketChannel(MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52530]}): peer=tmb://10.53.200.126:9300.54452, state=ACTIVE, socket=MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52530]}, migrations=7, bytes(in=13766783, out=23336481), flushlock false, bufferedOut=14.4KB, unflushed=0B, delivered(in=54389, out=58422), timeout(ack=2.13s), interestOps=1, unflushed receipt=0, receiptReturn 0, isReceiptFlushRequired false, bufferedIn(), msgs(in=29054, out=29524/29537)
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.readv0(Native Method)
at sun.nio.ch.SocketDispatcher.readv(SocketDispatcher.java:43)
at sun.nio.ch.IOUtil.read(IOUtil.java:278)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:435)
at com.oracle.common.internal.net.WrapperSocketChannel.read(WrapperSocketChannel.java:130)
at com.oracle.common.internal.net.MultiplexedSocketProvider$MultiplexedSocketChannel.read(MultiplexedSocketProvider.java:1547)
at com.oracle.common.internal.net.socketbus.AbstractSocketBus$Connection.read(AbstractSocketBus.java:1956)
at com.oracle.common.internal.net.socketbus.BufferedSocketBus$BufferedConnection.read(BufferedSocketBus.java:93)
at com.oracle.common.internal.net.socketbus.SocketMessageBus$MessageConnection$ReadBatch.read(SocketMessageBus.java:615)
at com.oracle.common.internal.net.socketbus.SocketMessageBus$MessageConnection.processReads(SocketMessageBus.java:206)
at com.oracle.common.internal.net.socketbus.BufferedSocketBus$BufferedConnection.onReadySafe(BufferedSocketBus.java:700)
at com.oracle.common.internal.net.socketbus.AbstractSocketBus$Connection.onReady(AbstractSocketBus.java:2135)
at com.oracle.common.internal.net.RunnableSelectionService.process(RunnableSelectionService.java:401)
at com.oracle.common.internal.net.RunnableSelectionService.run(RunnableSelectionService.java:274)
at com.oracle.common.internal.net.ResumableSelectionService.run(ResumableSelectionService.java:133)
at java.lang.Thread.run(Thread.java:745)