NullPointerException with Pubsub(PEP?) and clustering

Description

I'm not sure whether the problem is limited to the following scenario, or whether it can be reproduced reliably.

  • Running on a two-node cluster, Openfire 4.6.0 Alpha, Hazelcast plugin 2.5.0

  • A user logged in on the junior cluster node

  • On the senior cluster node, this NullPointerException was logged a couple of times in rapid succession (four times in two seconds):

2020.09.09 14:17:03 ERROR [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory - Unexpected exception running CallableTask[org.jivesoftware.o
java.lang.NullPointerException: null
    at org.jivesoftware.openfire.pubsub.cluster.AffiliationTask.run(AffiliationTask.java:47) ~[xmppserver-4.6.0-SNAPSHOT.jar:4.6.0-SNAPSHOT]
    at org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory$CallableTask.call(ClusteredCacheFactory.java:591) [hazelcast-2.5.0.jar!/:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
    at com.hazelcast.executor.impl.DistributedExecutorService$CallableProcessor.run(DistributedExecutorService.java:270) [hazelcast-3.12.5.jar!/:?]
    at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227) [hazelcast-3.12.5.jar!/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
    at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64) [hazelcast-3.12.5.jar!/:?]
    at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80) [hazelcast-3.12.5.jar!/:?]

Environment

None

Activity

Dan Caseley December 15, 2020 at 10:07 AM

I can reproduce this pretty easily on Master, but not on the branch.

Repro steps:

  • Have a user on both nodes subscribe to a PubSub node

That’s it. That’s the test.

Have looked at Pubsub, PEP, single servers and clustered operations.

Fix LGTM 👍

Guus der Kinderen November 11, 2020 at 1:17 PM

This problem is caused by an oversight in the implementation of various cluster tasks. A cluster task is primarily intended to update the in-memory representation of an entity that was modified on the cluster node that originates the task (it’s somewhat of a clean-up action). As such, a task often does not need to perform any action when the entity that it relates to isn’t loaded in memory on a particular node. The implementations of various tasks provide methods that return such entities only when they are loaded in memory. Consumers of those methods sometimes assume that an instance will be returned no matter what, which causes the NullPointerExceptions.

Apart from fixing the code, intent should be expressed better, by renaming methods and providing better documentation.
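
To illustrate the pattern described above, here is a minimal sketch of a cluster task that skips its work when the entity is not loaded in memory on the local node. It is not the actual Openfire fix: every name in it (ClusterTaskSketch, LocalNodeCache, getNodeIfLoaded, AffiliationUpdateTask) is hypothetical and chosen only to show how a method name and an Optional return type can make the "only if loaded" contract explicit.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these types are part of Openfire's API.
final class ClusterTaskSketch {

    /** Stand-in for a pubsub node held in memory on one cluster node. */
    static final class CachedNode {
        final Map<String, String> affiliations = new ConcurrentHashMap<>();
    }

    /** Stand-in for the in-memory cache of a single cluster node. */
    static final class LocalNodeCache {
        private final Map<String, CachedNode> loaded = new ConcurrentHashMap<>();

        void load(String nodeId) {
            loaded.putIfAbsent(nodeId, new CachedNode());
        }

        /**
         * Returns the node only if it is currently loaded in memory here.
         * The name and the Optional return type both signal that "absent"
         * is an expected outcome, not an error.
         */
        Optional<CachedNode> getNodeIfLoaded(String nodeId) {
            return Optional.ofNullable(loaded.get(nodeId));
        }
    }

    /** Stand-in for a task that mirrors an affiliation change made elsewhere. */
    static final class AffiliationUpdateTask implements Runnable {
        private final LocalNodeCache cache;
        private final String nodeId;
        private final String jid;
        private final String affiliation;
        private boolean applied;

        AffiliationUpdateTask(LocalNodeCache cache, String nodeId, String jid, String affiliation) {
            this.cache = cache;
            this.nodeId = nodeId;
            this.jid = jid;
            this.affiliation = affiliation;
        }

        @Override
        public void run() {
            Optional<CachedNode> node = cache.getNodeIfLoaded(nodeId);
            if (!node.isPresent()) {
                // Nothing is cached locally, so there is nothing to bring up to date.
                return;
            }
            node.get().affiliations.put(jid, affiliation);
            applied = true;
        }

        boolean wasApplied() {
            return applied;
        }
    }
}
```

With this shape, running the task on a node that never loaded the entity is a silent no-op instead of a NullPointerException, and a caller cannot forget the absent case without explicitly unwrapping the Optional.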

Fixed

Details

Created September 9, 2020 at 3:02 PM
Updated December 15, 2020 at 12:33 PM
Resolved December 15, 2020 at 12:33 PM
