ClosedChannelException gets lost

Description

I am occasionally but consistently getting a ClosedChannelException. Based on my trials, it occurs in the service-side message handling code, before deserialization. The stack trace I get is below.

I'm not sure if it's normal to get this now and again, but the problem is that I also can't detect or respond to it. The associated Future has isCompleted() = false and isDone() = false, and its get() method just hangs indefinitely rather than throwing the ExecutionException I would expect, so this exception seems to be getting lost somewhere.
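
A minimal sketch of the calling pattern on my side (the client and message names are illustrative, not my exact code; the Future is a plain java.util.concurrent.Future):

    import java.util.concurrent.Future;
    import com.google.protobuf.Message;

    // `client` is the Norbert network client; `request` is a protobuf message.
    Future<Message> f = client.sendMessage(request);

    f.isDone();  // still false, even after the server has logged the exception
    f.get();     // hangs indefinitely; no ExecutionException is ever thrown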

Note: I tried upgrading the Netty jar to 3.2.9 and protobuf to 2.4.1, but got the same problem.

10:39:07 [norbert-message-executor-thread-4] INFO netty.ServerChannelHandler - Caught exception in channel: [id: 0x037b9c77, /10.1.5.150:60030 :> /10.1.5.150:3940]
java.nio.channels.ClosedChannelException
at org.jboss.netty.channel.socket.nio.NioWorker.cleanUpWriteBuffer(NioWorker.java:637)
at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:370)
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
at org.jboss.netty.channel.Channels.write(Channels.java:632)
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
at org.jboss.netty.channel.Channels.write(Channels.java:632)
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
at org.jboss.netty.channel.Channels.write(Channels.java:632)
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
at com.linkedin.norbert.network.netty.ServerFilterChannelHandler.handleDownstream(ServerChannelHandler.scala:86)
at org.jboss.netty.channel.Channels.write(Channels.java:611)
at org.jboss.netty.channel.Channels.write(Channels.java:578)
at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
at com.linkedin.norbert.network.netty.ServerChannelHandler.responseHandler(ServerChannelHandler.scala:163)
at com.linkedin.norbert.network.netty.ServerChannelHandler$$anonfun$messageReceived$1.apply(ServerChannelHandler.scala:138)
at com.linkedin.norbert.network.netty.ServerChannelHandler$$anonfun$messageReceived$1.apply(ServerChannelHandler.scala:137)
at com.linkedin.norbert.network.server.ThreadPoolMessageExecutor$RequestRunner$$anonfun$run$3.apply(MessageExecutorComponent.scala:152)
at com.linkedin.norbert.network.server.ThreadPoolMessageExecutor$RequestRunner$$anonfun$run$3.apply(MessageExecutorComponent.scala:151)
at scala.Option.foreach(Option.scala:185)
at com.linkedin.norbert.network.server.ThreadPoolMessageExecutor$RequestRunner.run(MessageExecutorComponent.scala:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

Environment

Mac OS X 10.8.3, Norbert 0.6.30 (Scala 2.8.1)

6361 Mar 11 15:13 asm-1.5.3.jar
282338 Mar 11 15:13 cglib-2.1_3.jar
87325 Mar 11 15:13 jline-0.9.94.jar
121070 Mar 11 15:13 junit-3.8.1.jar
481535 Mar 11 15:13 log4j-1.2.16.jar
1409262 Mar 11 15:13 mockito-all-1.8.4.jar
786229 Mar 11 15:13 netty-3.2.3.Final.jar
1281711 Mar 11 14:51 norbert_2.8.1-0.6.30.jar
28569 Mar 11 15:13 objenesis-1.0.jar
449818 Mar 11 15:13 protobuf-java-2.4.0a.jar
8984943 Mar 11 15:13 scala-compiler-2.8.1.jar
6496110 Mar 11 15:13 scala-library-2.8.1.jar
22338 Mar 11 15:13 slf4j-api-1.5.6.jar
9678 Mar 11 15:13 slf4j-log4j12-1.5.6.jar
2865273 Mar 11 15:13 specs_2.8.1-1.6.7.jar
589023 Mar 11 15:13 zookeeper-3.3.0.jar

Activity

Ron Siemens
March 28, 2013, 5:45 PM

I've found that NetworkClientConfig has setCloseChannelTimeMillis(), which seems to be the timeout that automatically closes the channel in the case of particularly long work items. The default is 30s. By raising it, I can reduce the probability of the ClosedChannelException, but the client should still have a way to detect that this occurred.
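
For anyone else hitting this, a minimal sketch of the workaround (the package path is my assumption from the Java compatibility layer; the rest of the client setup is omitted):

    import com.linkedin.norbert.javacompat.network.NetworkClientConfig;

    NetworkClientConfig config = new NetworkClientConfig();
    // Raise the automatic close timeout from the 30s default so that
    // long-running requests are less likely to have their channel closed
    // underneath them. 120000 ms (2 minutes) is an arbitrary example value.
    config.setCloseChannelTimeMillis(120000);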

Sandris A.
April 11, 2013, 11:54 AM

I have the same problem. What are the options for tracking this exception if it cannot be caught? Would sending smaller messages between servers help? When can we expect a fix?

Ron Siemens
April 11, 2013, 5:43 PM

I've managed this situation by never using the Future's no-argument get(). I only use the timed get(length, TimeUnit.X), with the timeout set to the same value as the automatic close-channel time. While not perfect, my client at least now times out as well and can respond as it deems necessary.
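
Concretely, something like this (the helper and constant are mine, not part of Norbert):

    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    // Mirror whatever value was passed to setCloseChannelTimeMillis().
    static final long CLOSE_CHANNEL_TIME_MILLIS = 30000;

    static <T> T getWithChannelTimeout(Future<T> future)
            throws InterruptedException, ExecutionException, TimeoutException {
        // Never use the no-argument get(): if the channel was closed
        // server-side, the future may never complete and the call hangs.
        return future.get(CLOSE_CHANNEL_TIME_MILLIS, TimeUnit.MILLISECONDS);
    }

A TimeoutException here then stands in for the lost ClosedChannelException, and the caller can retry or fail the request as appropriate.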

Sandris A.
April 12, 2013, 5:14 PM

Thank you for your answer. One more question: is there any way to detect this on the other side, so that the channel can be reopened?

Assignee

Joshua Hartman

Reporter

Ron Siemens

Labels

None

Components

Affects versions

Priority

Major