Peter (psa) writes:
Over the weekend we experienced a significant denial of service attack
at the IP address of the jabber.org IM service, forcing us to take the
service offline and then bring it back online in a limited capacity
through some DNS hacks. In doing so, we discovered that domains
running Openfire were unable to connect to the service at a fallback
domain that we set up in our DNS SRV records:
$ dig +short -t SRV _xmpp-server._tcp.jabber.org
30 30 5269 hermes.jabber.org.
30 30 5269 hermes6.jabber.org.
31 31 5269 fallback.jabber.org.
(hermes and hermes6 are the main IP addresses, and fallback is the
secondary address we set up temporarily)
I thought you might want to be aware of this issue and test your SRV
code at some point. (You can still test this configuration on
jabber.org, but hermes is now back online.)
I believe there are two root causes for the problem identified by PSA:
1. Openfire builds a list of the hosts that serve a particular XMPP domain (as advertised by its DNS SRV records) and orders them correctly by priority. It does not, however, randomize connection attempts among hosts that share the same priority: no weighted random selection based on the weight values is applied.
2. Every connection attempt after the first one fails, because the socket from the previous, failed attempt is reused even though it is already closed by then.
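To illustrate both points, here is a minimal sketch of how they could be handled. It is not Openfire's actual code: the class SrvOrdering, the SrvRecord type and both method names are made up for this example. The ordering loosely follows the selection algorithm described in RFC 2782 (lowest priority value first, weighted random choice within one priority), and the connect loop creates a new Socket for every attempt instead of reusing one that has already failed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Openfire.
final class SrvOrdering {

    /** One SRV record, in the column order dig prints: priority weight port target. */
    static final class SrvRecord {
        final int priority, weight, port;
        final String target;

        SrvRecord(int priority, int weight, int port, String target) {
            this.priority = priority;
            this.weight = weight;
            this.port = port;
            this.target = target;
        }
    }

    /** Returns the records in the order in which connections should be attempted. */
    static List<SrvRecord> order(List<SrvRecord> records, Random rnd) {
        // A TreeMap iterates its keys in ascending order: the lowest priority value is tried first.
        TreeMap<Integer, List<SrvRecord>> byPriority = new TreeMap<>();
        for (SrvRecord r : records) {
            byPriority.computeIfAbsent(r.priority, k -> new ArrayList<>()).add(r);
        }
        List<SrvRecord> result = new ArrayList<>();
        for (List<SrvRecord> group : byPriority.values()) {
            // Zero-weight records go first so they keep a small chance of being picked.
            group.sort(Comparator.comparingInt(r -> r.weight == 0 ? 0 : 1));
            while (!group.isEmpty()) {
                int total = 0;
                for (SrvRecord r : group) {
                    total += r.weight;
                }
                int pick = rnd.nextInt(total + 1); // uniform in [0, total]
                int running = 0;
                int chosen = 0;
                for (int i = 0; i < group.size(); i++) {
                    running += group.get(i).weight;
                    if (running >= pick) {
                        chosen = i;
                        break;
                    }
                }
                result.add(group.remove(chosen));
            }
        }
        return result;
    }

    /** Tries each target in turn and returns the first socket that connects. */
    static Socket connectFirstReachable(List<SrvRecord> ordered, int timeoutMillis) throws IOException {
        IOException last = null;
        for (SrvRecord r : ordered) {
            // A fresh socket per attempt: a socket whose connect failed is closed and cannot be reused.
            Socket socket = new Socket();
            try {
                socket.connect(new InetSocketAddress(r.target, r.port), timeoutMillis);
                return socket;
            } catch (IOException e) {
                last = e;
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do here; move on to the next target
                }
            }
        }
        throw last != null ? last : new IOException("no SRV targets to try");
    }

    public static void main(String[] args) {
        List<SrvRecord> records = Arrays.asList(
                new SrvRecord(30, 30, 5269, "hermes.jabber.org"),
                new SrvRecord(30, 30, 5269, "hermes6.jabber.org"),
                new SrvRecord(31, 31, 5269, "fallback.jabber.org"));
        for (SrvRecord r : order(records, new Random())) {
            System.out.println(r.priority + " " + r.weight + " " + r.port + " " + r.target);
        }
    }
}
```

With the jabber.org records quoted above, hermes and hermes6 come out in varying order from run to run, while fallback always comes last and is only tried once both of the others have failed.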
I applied a fix for both causes to trunk a couple of minutes ago.