
Thread pool for new s2s connections may get exhausted when remote servers are unresponsive

Description

After establishing a new connection to a remote server (secured connection or server dialback), Wildfire sends a stream header and expects another from the remote server. If the connection was lost without the JVM noticing, or the remote server never responds for some other reason, Wildfire will wait forever, possibly blocking other threads that are trying to contact the same server.
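One common way to bound a wait like this is a socket read timeout. The sketch below is illustrative only and is not taken from the Wildfire source: the class ReadTimeoutSketch, the method awaitFirstByte and the timeout values are all hypothetical, and a local server that accepts connections but never writes stands in for the unresponsive remote server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {

    /**
     * Connects to the remote address and waits for the first byte of its reply.
     * Returns true if a byte arrived, false if the read timed out or the
     * stream ended. Hypothetical helper, not a Wildfire API.
     */
    static boolean awaitFirstByte(InetSocketAddress remote,
                                  int connectTimeoutMs,
                                  int readTimeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(remote, connectTimeoutMs);
            // Without this call, read() below can block indefinitely
            // when the peer stays silent.
            socket.setSoTimeout(readTimeoutMs);
            InputStream in = socket.getInputStream();
            try {
                return in.read() != -1;
            } catch (SocketTimeoutException e) {
                // The remote side did not answer in time; give the thread back.
                return false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // This server never calls accept() or writes, so the client's
        // read can only end through the timeout.
        try (ServerSocket silent =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            InetSocketAddress remote = new InetSocketAddress(
                silent.getInetAddress(), silent.getLocalPort());
            boolean answered = awaitFirstByte(remote, 1000, 500);
            System.out.println("remote answered: " + answered);
        }
    }
}
```

With a timeout in place, the thread that holds the per-domain lock is released after a bounded delay instead of staying in socketRead0 indefinitely.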

A future enhancement will modify OutgoingSessionPromise to use just one thread per queued domain, so that a remote domain that is not responding consumes only one thread. This makes smarter use of the thread pool.
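The one-thread-per-domain idea could look roughly like the sketch below. It is an assumption about the general shape of such a change, not the actual OutgoingSessionPromise code: the class PerDomainDispatcher and its methods are hypothetical, and error handling (for example a rejected execution) is left out.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PerDomainDispatcher {
    private final ExecutorService pool;
    // Pending work per domain. A domain is present in this map only
    // while a pool thread is assigned to it.
    private final Map<String, Queue<Runnable>> pending = new HashMap<>();

    public PerDomainDispatcher(ExecutorService pool) {
        this.pool = pool;
    }

    /** Queues a task for a domain, using at most one pool thread per domain. */
    public void submit(String domain, Runnable task) {
        synchronized (pending) {
            Queue<Runnable> queue = pending.get(domain);
            if (queue != null) {
                queue.add(task); // a worker already owns this domain
                return;
            }
            queue = new ArrayDeque<>();
            queue.add(task);
            pending.put(domain, queue);
        }
        pool.execute(() -> drain(domain));
    }

    // Runs the domain's tasks one by one until its queue is empty.
    private void drain(String domain) {
        while (true) {
            Runnable next;
            synchronized (pending) {
                next = pending.get(domain).poll();
                if (next == null) {
                    pending.remove(domain); // release the domain
                    return;
                }
            }
            try {
                next.run();
            } catch (RuntimeException e) {
                // A failing task must not stop the rest of the queue.
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        PerDomainDispatcher dispatcher = new PerDomainDispatcher(pool);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);

        // Simulates a remote domain that never responds.
        Runnable stuck = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int i = 0; i < 3; i++) {
            dispatcher.submit("stuck.example", stuck);
        }
        dispatcher.submit("ok.example", done::countDown);

        System.out.println("other domain served: " + done.await(2, TimeUnit.SECONDS));
        release.countDown();
        pool.shutdown();
    }
}
```

In this sketch the three tasks for the stuck domain occupy a single pool thread, so the second thread remains free for other domains. If each task were submitted to the pool directly, two stuck tasks would be enough to exhaust a two-thread pool.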

Example of thread dumps due to this issue:

"pool-3-thread-5" prio=1 tid=0x08a730b0 nid=0x3ed4 runnable [0x698ff000..0x698ff570]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:411)
    at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:453)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:183)
    - locked <0x73a8d950> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at org.xmlpull.mxp1.MXParser.fillBuf(MXParser.java:2971)
    at org.xmlpull.mxp1.MXParser.more(MXParser.java:3025)
    at org.xmlpull.mxp1.MXParser.parseProlog(MXParser.java:1410)
    at org.jivesoftware.wildfire.net.MXParser.nextImpl(MXParser.java:331)
    at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
    at org.jivesoftware.wildfire.server.OutgoingServerSession.createOutgoingSession(OutgoingServerSession.java:284)
    at org.jivesoftware.wildfire.server.OutgoingServerSession.authenticateDomain(OutgoingServerSession.java:140)
    - locked <0x7081c510> (a java.lang.String)
    at org.jivesoftware.wildfire.server.OutgoingSessionPromise.createSessionAndSendPacket(OutgoingSessionPromise.java:126)
    at org.jivesoftware.wildfire.server.OutgoingSessionPromise.access$300(OutgoingSessionPromise.java:37)
    at org.jivesoftware.wildfire.server.OutgoingSessionPromise$1$1.run(OutgoingSessionPromise.java:91)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
    at java.lang.Thread.run(Thread.java:595)

"pool-3-thread-4" prio=1 tid=0x08a74c38 nid=0x3ed3 waiting for monitor entry [0x69dff000..0x69dff5f0]
    at org.jivesoftware.wildfire.server.OutgoingServerSession.authenticateDomain(OutgoingServerSession.java:138)
    - waiting to lock <0x7081c510> (a java.lang.String)
    at org.jivesoftware.wildfire.server.OutgoingSessionPromise.createSessionAndSendPacket(OutgoingSessionPromise.java:126)
    at org.jivesoftware.wildfire.server.OutgoingSessionPromise.access$300(OutgoingSessionPromise.java:37)
    at org.jivesoftware.wildfire.server.OutgoingSessionPromise$1$1.run(OutgoingSessionPromise.java:91)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
    at java.lang.Thread.run(Thread.java:595)

Environment

None

Acceptance Test - Entry

None

Assignee

Gaston Dombiak

Reporter

Gaston Dombiak

Labels

None

Expected Effort

None

Ignite Forum URL

None

Components

Fix versions

Affects versions

Priority

Critical