Cache data inconsistency: incoming server sessions

Description

This cache data inconsistency was found after deploying the second snapshot that includes the MUC/Clustering reimplementation.

The system had run with an updated version of the monitoring plugin on one node (the other node did not seem to have the plugin installed at all). The plugin was experiencing problems, as reported in https://github.com/igniterealtime/openfire-monitoring-plugin/issues/198

We left this running overnight.

In the morning, the node that had the plugin was sluggish, and logins to the admin console failed. The log files showed that the database connection pool was exhausted, which caused all kinds of exceptions. Many of the stack traces could be traced back to RRD-based statistics (a feature of the monitoring plugin).

Unloading the monitoring plugin did not resolve the issue.

The node that was experiencing the problem was restarted, which restored functionality. After the restart, however, both nodes showed a cache inconsistency. Before the restart, the second node did not show this inconsistency; the first node could not be checked, as logging into the admin console was impossible.

Cache inconsistency on node 1 (the node that got restarted):

Cache inconsistency on node 2:

Environment

None

Attachments

1

Activity

Guus der Kinderen October 28, 2021 at 9:44 AM

The node that was restarted reports: “incomingServerSessionsByClusterNode tracks data for cluster nodes that are not recognized”. That data structure tracks data for the nodeID that was associated with the node itself prior to its restart.
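
The kind of check behind that message can be sketched roughly as follows. This is a minimal illustration, not Openfire's actual code: it assumes the tracking structure behaves like a map keyed by node ID, and the class name, method name, node IDs and stream IDs are all hypothetical. It shows how an entry keyed by the pre-restart node ID ends up flagged as belonging to an unrecognized cluster node.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from Openfire itself.
public class StaleNodeCheck {

    /**
     * Returns the node IDs that are used as keys in the tracking map
     * but are not part of the current set of cluster members.
     */
    static Set<String> unrecognizedNodes(Map<String, Set<String>> sessionsByNode,
                                         Set<String> clusterNodes) {
        Set<String> stale = new TreeSet<>(sessionsByNode.keySet());
        stale.removeAll(clusterNodes);
        return stale;
    }

    public static void main(String[] args) {
        // Stand-in for a structure such as incomingServerSessionsByClusterNode:
        // node ID -> identifiers of the sessions hosted on that node.
        Map<String, Set<String>> sessionsByNode = new HashMap<>();
        sessionsByNode.put("node1-before-restart", Set.of("stream-a"));
        sessionsByNode.put("node2", Set.of("stream-b"));

        // Assumption: the restarted node rejoined the cluster under a new node ID.
        Set<String> clusterNodes = Set.of("node1-after-restart", "node2");

        Set<String> stale = unrecognizedNodes(sessionsByNode, clusterNodes);
        System.out.println("Unrecognized node IDs: " + stale);

        // One possible remedy: drop entries that belong to node IDs no longer in the cluster.
        sessionsByNode.keySet().removeAll(stale);
        System.out.println("Remaining keys: " + sessionsByNode.keySet());
    }
}
```

Whether the real fix prunes such entries on restart, on cluster join, or elsewhere is not stated in this issue; the pruning step above is only one plausible approach.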

Fixed

Created October 28, 2021 at 8:29 AM
Updated October 28, 2021 at 6:53 PM
Resolved October 28, 2021 at 6:53 PM