We have recently increased our use of Norbert, which means more traffic on our cluster.
Using default settings on both the server and the client side, we have started to see ClosedChannelException in some cases. We have seen three different failure scenarios, but they all boil down to the same root cause:
In ChannelPool, sendRequest calls writeRequestToChannel(), which performs an asynchronous write. It then calls checkinChannel, which uses reuseChannel to decide whether the channel should be kept or closed. With the default config, a channel is closed 30 seconds after it is created.
If it has expired, checkinChannel calls poolEntry.channel.close().
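The sequence above can be sketched with plain java.util.concurrent primitives (a toy model, not Norbert's or Netty's actual code): because the write is asynchronous, the expiry-triggered close can run while the write is still in flight.

```java
import java.nio.channels.ClosedChannelException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicBoolean;

public class ChannelRace {
    // Toy stand-in for a pooled channel: an open flag plus an async write.
    static class FakeChannel {
        final AtomicBoolean open = new AtomicBoolean(true);

        // Async write, like Netty's Channel.write: returns before the bytes go out.
        CompletableFuture<Void> write(String msg) {
            return CompletableFuture.runAsync(() -> {
                try { Thread.sleep(50); } catch (InterruptedException e) { }
                if (!open.get())
                    throw new CompletionException(new ClosedChannelException());
            });
        }

        void close() { open.set(false); }
    }

    public static void main(String[] args) {
        FakeChannel ch = new FakeChannel();
        CompletableFuture<Void> pending = ch.write("request");
        ch.close(); // the expiry check fires before the write has completed
        try {
            pending.join();
            System.out.println("write ok");
        } catch (CompletionException e) {
            // The in-flight write observes the close: scenario 2 in miniature.
            System.out.println("write failed: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

The same shape explains the other scenarios: if the close wins before the write is flushed, either the write fails, the close fails, or the request goes out on a socket whose response can never be read.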
The problems this causes manifest themselves in three ways:
1. The request is sent, but no response is received since the socket was closed (most common)
2. ClosedChannelException on write
3. ClosedChannelException on close, since the write buffer is not empty
Scenario 1 is very easy to reproduce: set closeChannelTimeMillis to 1 ms in the client config and dispatch a few requests.
Scenario 2 seems a bit trickier, but it happens almost every time my app is freshly initialized.
Scenario 3 has only been seen in production, but the stack trace hints that this is the problem.
Solution: I'm not sure; I'm too unfamiliar with Netty internals to come up with one off the top of my head. Perhaps the channel could be closed only after any pending requests have finished?
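A hedged sketch of that idea, modeled with java.util.concurrent rather than Netty's real types: on expiry, instead of closing immediately, register the close as a completion action on the pending write. For a single write, Netty 3 offers this directly as ChannelFutureListener.CLOSE; the names below are illustrative, not Norbert's.

```java
import java.nio.channels.ClosedChannelException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicBoolean;

public class DeferredClose {
    // Hedged sketch, not Norbert's code: track the last pending write and
    // defer an expiry-triggered close until that write has completed.
    static class FakeChannel {
        final AtomicBoolean open = new AtomicBoolean(true);
        volatile CompletableFuture<Void> lastWrite = CompletableFuture.completedFuture(null);

        CompletableFuture<Void> write(String msg) {
            CompletableFuture<Void> f = CompletableFuture.runAsync(() -> {
                try { Thread.sleep(50); } catch (InterruptedException e) { }
                if (!open.get())
                    throw new CompletionException(new ClosedChannelException());
            });
            lastWrite = f;
            return f;
        }

        void close() { open.set(false); }

        // Close only once the pending write finishes, success or failure,
        // analogous to Netty 3's ChannelFutureListener.CLOSE for one write.
        CompletableFuture<Void> closeWhenWritesComplete() {
            return lastWrite.whenComplete((v, t) -> close());
        }
    }

    public static void main(String[] args) {
        FakeChannel ch = new FakeChannel();
        CompletableFuture<Void> pending = ch.write("request");
        CompletableFuture<Void> closed = ch.closeWhenWritesComplete(); // expiry no longer races the write
        pending.join();  // completes normally this time
        closed.join();   // the channel is closed only afterwards
        System.out.println("write ok, channel open=" + ch.open.get());
    }
}
```

In a real fix the pool would also have to stop handing the channel out once it is marked for closing, so no new writes sneak in between expiry and the deferred close.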
Workaround: set closeChannelTimeMillis to -1 or 0
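For reference, the workaround might look roughly like this (hypothetical snippet: the property name closeChannelTimeMillis comes from this report, but the config class name and accessor style are assumptions and may differ in your Norbert version):

```java
// Hypothetical: class name and accessor style assumed, not verified.
NetworkClientConfig config = new NetworkClientConfig();
config.closeChannelTimeMillis = -1; // -1 (or 0) disables the timed channel close
// ... pass config to the network client as usual
```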
This seems to be the cause for as well.
For scenario 1 there is no stack trace; the Future just never resolves, since no response ever arrives.
For scenario 2, the async write fails, since the socket has already been closed.
For scenario 3, the close fails, since the async write has not yet completed.