Cache state inconsistencies after Netty upgrade

Description

This issue affects a version of Openfire that is as of yet unreleased. It should not affect anyone running a proper release of Openfire. It is likely introduced by .

Since migrating from MINA to Netty (OF-2559) the IgniteRealtime server is constantly warning admins about cache state inconsistencies.

I initially hoped that these were a symptom of another issue, and that they’d resolve themselves once other issues were fixed. Many of those issues have since been fixed, yet the warnings still occur, so I’m no longer as hopeful.

These cache state inconsistencies should be investigated and fixed.

These inconsistencies are reported to server admins periodically. The cache state can be verified manually by opening the JSP page system-clustering-data-consistency-check.jsp in the admin console (this page is not linked in any menu, so its name must be added to the admin console URL by hand). Note that opening this page can add considerable overhead, so doing this in high-load production environments is not recommended.

Environment

None

Activity

Guus der Kinderen October 21, 2023 at 5:14 PM

With Dan's comment suggesting that the remaining issue was tied to cache expiry, the problem was identified. A cache was expected to use a never-expire configuration, but did not. When cleaning up a connection, the cleanup routine depends on a cache entry being present. If that entry had already expired, the cleanup could not do its work, resulting in the cache inconsistency warnings.
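To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not Openfire's code: the class `ExpiringCacheSketch`, the method `cleanUpSession`, the `NEVER_EXPIRE` constant and the domain `example.org` are all hypothetical names made up for this illustration. The time is passed in explicitly so that the six-hour expiry can be simulated without waiting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cache with a maximum entry lifetime. Not Openfire's Cache API.
class ExpiringCacheSketch {

    // Convention used in this sketch only: a negative lifetime means entries never expire.
    static final long NEVER_EXPIRE = -1;

    private final long maxLifetimeMillis;
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> createdAt = new HashMap<>();

    ExpiringCacheSketch(long maxLifetimeMillis) {
        this.maxLifetimeMillis = maxLifetimeMillis;
    }

    void put(String key, String value, long nowMillis) {
        values.put(key, value);
        createdAt.put(key, nowMillis);
    }

    // Removes and returns the entry, or returns null if it is absent or has expired.
    String remove(String key, long nowMillis) {
        Long created = createdAt.remove(key);
        String value = values.remove(key);
        if (created == null) {
            return null;
        }
        boolean expired = maxLifetimeMillis != NEVER_EXPIRE
                && nowMillis - created > maxLifetimeMillis;
        return expired ? null : value;
    }

    // Hypothetical cleanup routine: it can only finish its work if the cache entry is still there.
    static boolean cleanUpSession(ExpiringCacheSketch cache, String streamId, long nowMillis) {
        String sessionInfo = cache.remove(streamId, nowMillis);
        // Without the entry there is nothing to drive the rest of the cleanup,
        // so other bookkeeping is left behind in an inconsistent state.
        return sessionInfo != null;
    }

    public static void main(String[] args) {
        long sixHours = 6 * 60 * 60 * 1000L;

        ExpiringCacheSketch expiring = new ExpiringCacheSketch(sixHours);
        expiring.put("7sm8ivgs06", "example.org", 0);
        // The session is closed just after the entry aged out: cleanup cannot complete.
        System.out.println(cleanUpSession(expiring, "7sm8ivgs06", sixHours + 1)); // false

        ExpiringCacheSketch neverExpiring = new ExpiringCacheSketch(NEVER_EXPIRE);
        neverExpiring.put("7sm8ivgs06", "example.org", 0);
        // With a never-expire configuration the entry is still present: cleanup completes.
        System.out.println(cleanUpSession(neverExpiring, "7sm8ivgs06", sixHours + 1)); // true
    }
}
```

In this sketch the only difference between the two runs is the lifetime setting, which mirrors the fix: a session can outlive a six-hour entry, so the cache backing it must not expire entries on its own.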

Dan Caseley October 20, 2023 at 5:07 PM

Twice this week, cache inconsistencies have turned up just over 6 hours after a reboot of Ignite’s server.

Searching Ignite’s logs shows that the last mention of any of the missing keys is just after server reboot.

The Incoming Server Session Info Cache has a max item age of 6 hours. Is it likely that these are being naturally aged out of the cache?

Guus der Kinderen October 13, 2023 at 8:00 AM

Over the last few weeks I’ve quite reliably only seen warnings for the “Incoming Server Session Info Cache”. I suspect that all other reported inconsistencies have gone away when we resolved other issues.

An example of the reported issue is this:

Not all elements in SessionManager's localSessionManager exist in Incoming Server Session Info Cache. These 5 entries do not: 7sm8ivgs06, 7y0xc3jyfj, 73kb5pzs1a, 43k2oc00sf, 3cbxktmqzu
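For readers unfamiliar with what this warning expresses, the check behind it can be thought of as a set difference: every locally tracked session should have a matching key in the cache. The sketch below is an assumption about the general shape of such a check, not Openfire's actual implementation; the class `ConsistencyCheckSketch` and its methods are hypothetical, and only the wording of the warning is taken from the message above.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of a local-state versus cache comparison. Not Openfire's code.
class ConsistencyCheckSketch {

    // Returns the IDs of locally tracked sessions that have no entry in the cache.
    static Set<String> missingFromCache(Set<String> localSessionIds, Map<String, ?> cache) {
        Set<String> missing = new TreeSet<>(localSessionIds);
        missing.removeAll(cache.keySet());
        return missing;
    }

    // Formats the result in the style of the warning quoted above.
    static String describe(Set<String> missing, String cacheName) {
        if (missing.isEmpty()) {
            // Wording of the "all good" case is made up for this sketch.
            return "No inconsistencies found in " + cacheName + ".";
        }
        return "Not all elements in SessionManager's localSessionManager exist in " + cacheName
                + ". These " + missing.size() + " entries do not: " + String.join(", ", missing);
    }

    public static void main(String[] args) {
        Set<String> local = Set.of("7sm8ivgs06", "7y0xc3jyfj", "stillcached");
        Map<String, String> cache = Map.of("stillcached", "session info");
        System.out.println(describe(missingFromCache(local, cache),
                "Incoming Server Session Info Cache"));
    }
}
```

A check of this shape touches every local session and every cache key, which is consistent with the note in the description that running it can add considerable overhead.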

Guus der Kinderen September 22, 2023 at 12:30 PM

The ‘unexpected’ state of the Client Session Info Cache seems to already be there in Openfire 4.7.5. I cannot explain why we did not see warnings for that before.

Guus der Kinderen September 22, 2023 at 8:23 AM

With the changes in and the warnings relating to the S2S domain pairs and Routing Servers Cache have gone away.

What remains are warnings related to:

  • Incoming Server Session Info Cache

  • Client Session Info Cache

The corresponding messages are:

  • Not all elements in SessionManager's localSessionManager exist in Incoming Server Session Info Cache.

  • Not all elements in SessionManager's localSessionManager exist in Client Session Info Cache.

Fixed

Details


Created September 7, 2023 at 2:39 PM
Updated October 21, 2023 at 5:14 PM
Resolved October 21, 2023 at 5:14 PM