There is an issue with how Openfire tracks the state of the Hazelcast cluster.
The org.jivesoftware.openfire.cluster.ClusterManager class has 2 methods that report the state of the cluster: isClusteringStarting() and isClusteringStarted(). These methods read their values from the org.jivesoftware.util.cache.CacheFactory class, which holds 2 boolean variables, "clusteringStarting" and "clusteringStarted".
In CacheFactory.startClustering(), clusteredCacheFactoryStrategy.startCluster() is called and its return value is assigned to "clusteringStarting". In the startCluster() method of the com.jivesoftware.util.cache.ClusteredCacheFactory class, Hazelcast is initialized and then the ClusterListener constructor is called. In that flow, CacheFactory.joinedCluster() is called, which sets "clusteringStarting" to "false" and "clusteringStarted" to "true". After this, however, startCluster() evaluates "cluster != null" and returns true, and CacheFactory.startClustering() assigns that value to "clusteringStarting", overwriting the "false" that joinedCluster() just set.
As a result, once the cluster has started successfully, both "clusteringStarting" and "clusteringStarted" are true. This is a problem, and it causes a memory leak in group chat scenarios.
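To make the ordering problem easier to see, here is a minimal standalone sketch of the flow described above. It is not the Openfire source: the class name ClusterFlagsSketch is made up, the Hazelcast instance is replaced by a plain Object, and the methods are simplified stand-ins that only mirror the sequence of assignments as I understand it.

```java
// Simplified stand-in for the flag handling described above.
// This is NOT the actual Openfire code; only the order of assignments is mirrored.
class ClusterFlagsSketch {
    static boolean clusteringStarting = false;
    static boolean clusteringStarted = false;

    // Stand-in for CacheFactory.joinedCluster(), which runs while
    // startCluster() is still executing.
    static void joinedCluster() {
        clusteringStarting = false;
        clusteringStarted = true;
    }

    // Stand-in for ClusteredCacheFactory.startCluster().
    static boolean startCluster() {
        Object cluster = new Object(); // assumption: represents the initialized Hazelcast cluster
        joinedCluster();               // reached via the ClusterListener constructor
        return cluster != null;        // true on success
    }

    // Stand-in for CacheFactory.startClustering().
    static void startClustering() {
        // The return value overwrites the "false" that joinedCluster() just set.
        clusteringStarting = startCluster();
    }

    public static void main(String[] args) {
        startClustering();
        System.out.println("clusteringStarting=" + clusteringStarting
                + ", clusteringStarted=" + clusteringStarted);
    }
}
```

Running the sketch prints both flags as true, which matches the state observed after a successful cluster start.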
The abstract class org.jivesoftware.openfire.muc.cluster.MUCRoomTask executes room-related tasks in a cluster. Its "execute" method calls ClusterManager.isClusteringStarting(). If a room is not found, an IllegalArgumentException is caught, and in the exception handling block, if "clusteringStarting" is true, the task is added to a queue in QueuedTasksManager. The QueuedTasksManager removes tasks only when "clusteringStarting" is false. Because "clusteringStarting" stays true after the cluster has started, a task is added to the queue every time a room is not found (for whatever reason) and is never removed, so the queue grows slowly and leaks memory.
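The effect on the queue can be illustrated with a similarly rough sketch. Again, this is not the real MUCRoomTask or QueuedTasksManager code: the class QueuedTaskLeakSketch, the queuedTasks field and the processQueuedTasks() method are hypothetical names, and the "room not found" case is simulated by always throwing.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model of the queueing behaviour described above.
// This is NOT the actual Openfire code.
class QueuedTaskLeakSketch {
    // Assumption: the flag is stuck at true after a successful cluster start.
    static boolean clusteringStarting = true;
    static final Queue<Runnable> queuedTasks = new ArrayDeque<>();

    // Stand-in for MUCRoomTask.execute() when the room cannot be found.
    static void execute(Runnable task) {
        try {
            throw new IllegalArgumentException("room not found"); // simulated lookup failure
        } catch (IllegalArgumentException e) {
            if (clusteringStarting) {
                queuedTasks.add(task); // parked until clustering has finished starting
            }
        }
    }

    // Stand-in for the queue manager: tasks are only drained once the
    // "starting" flag is false, which never happens here.
    static void processQueuedTasks() {
        if (clusteringStarting) {
            return;
        }
        while (!queuedTasks.isEmpty()) {
            queuedTasks.poll().run();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            execute(() -> { });
            processQueuedTasks();
        }
        System.out.println("queued tasks: " + queuedTasks.size());
    }
}
```

In this model every failed lookup leaves one more task in the queue, and nothing is drained while the flag remains true.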
Could you please log a bug for this?
Hi Tom, do you have any thoughts on this issue? Thanks!
I seem to recall some fixes in this area, but I am unsure whether they address this particular issue. Can anybody still watching this issue comment?