Peter (psa) writes:

Hallo Guus!

Over the weekend we experienced a significant denial of service attack
at the IP address of the IM service, forcing us to take the
service offline and then bring it back online in a limited capacity
through some DNS hacks. In doing so, we discovered that domains
running Openfire were unable to connect to the service at a fallback
domain that we set up in our DNS SRV records:

$ dig +short -t SRV
30 30 5269
30 30 5269
31 31 5269

(hermes and hermes6 are the main IP addresses, and fallback is the
secondary address we set up temporarily)

I thought you might want to be aware of this issue and test your SRV
code at some point. (You can still test this configuration on, but hermes is now back online.)






Guus der Kinderen
November 4, 2012, 10:18 AM

I applied a fix for both causes to trunk a couple of minutes ago.

Guus der Kinderen
November 4, 2012, 9:55 AM

I believe there are two root causes for the problem identified by PSA:

  1. Openfire builds a list of hosts that are applicable for a particular XMPP domain (as advertised by DNS SRV records). It orders those hosts by priority correctly, but it does not randomize connection attempts among hosts that share the same priority: the weight values are ignored, so no weighted random selection takes place.

  2. Every connection attempt after the first one fails, because the socket from the previous (failed) attempt is reused instead of a new one being created, and that socket is already closed by then.
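
To illustrate the two causes, here is a minimal sketch of how they could be addressed. It is not the change that was committed to trunk. The names SrvFallback, SrvRecord, order and connect are hypothetical, and the weighted selection follows the algorithm described in RFC 2782 (lower priority value first, then weighted random choice among records of equal priority).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

public class SrvFallback {

    /** One SRV answer. Hypothetical type, not Openfire's own class. */
    public static final class SrvRecord {
        public final int priority;
        public final int weight;
        public final int port;
        public final String target;

        public SrvRecord(int priority, int weight, int port, String target) {
            this.priority = priority;
            this.weight = weight;
            this.port = port;
            this.target = target;
        }
    }

    /**
     * Cause 1: order by ascending priority, and within one priority pick
     * records by weighted random selection (RFC 2782).
     */
    public static List<SrvRecord> order(List<SrvRecord> records, Random random) {
        // TreeMap iterates from the lowest priority value (tried first) upwards.
        Map<Integer, List<SrvRecord>> byPriority = new TreeMap<>();
        for (SrvRecord r : records) {
            byPriority.computeIfAbsent(r.priority, k -> new ArrayList<>()).add(r);
        }
        List<SrvRecord> ordered = new ArrayList<>();
        for (List<SrvRecord> group : byPriority.values()) {
            List<SrvRecord> remaining = new ArrayList<>(group);
            // Sorting by weight puts zero-weight records at the front, as RFC 2782 asks.
            remaining.sort(Comparator.comparingInt(r -> r.weight));
            while (!remaining.isEmpty()) {
                int total = 0;
                for (SrvRecord r : remaining) {
                    total += r.weight;
                }
                int pick = random.nextInt(total + 1); // uniform in 0..total inclusive
                int running = 0;
                int chosen = remaining.size() - 1;
                for (int i = 0; i < remaining.size(); i++) {
                    running += remaining.get(i).weight;
                    if (running >= pick) { // first record whose running sum reaches the pick
                        chosen = i;
                        break;
                    }
                }
                ordered.add(remaining.remove(chosen));
            }
        }
        return ordered;
    }

    /**
     * Cause 2: create a new Socket for every attempt. A socket that failed to
     * connect and was closed cannot be connected again.
     */
    public static Socket connect(List<SrvRecord> ordered, int timeoutMillis) throws IOException {
        IOException last = new IOException("no SRV records to try");
        for (SrvRecord r : ordered) {
            Socket socket = new Socket();
            try {
                socket.connect(new InetSocketAddress(r.target, r.port), timeoutMillis);
                return socket;
            } catch (IOException e) {
                last = e;
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do; move on to the next host
                }
            }
        }
        throw last;
    }
}
```

With three records shaped like the ones in the report (two at priority 30 with weight 30, one at priority 31), order() would place the priority 31 host last every time and vary which of the two priority 30 hosts comes first, and connect() would only reach the fallback host after both main hosts have refused or timed out.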



