The current implementation stores all items published to a node in memory. This is not scalable and will easily cause out-of-memory errors as items published across all nodes accumulate. The larger the payloads, the sooner this happens.
I would suggest two possible solutions:
1. Store only the item IDs in memory long term. Published items would be held in memory only while they are waiting to be persisted. This would drastically reduce the memory footprint while still allowing an easy check for whether an existing item is being overwritten.
2. Store only the last published item in memory. The check for overwriting an existing record can be done and handled (when there is an ID match) in the persistence manager. This would keep no IDs or items in memory other than the last item published.
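The first option above can be sketched as a small cache class. This is only an illustrative sketch, not the actual Openfire code; the class and method names are hypothetical. It keeps a lightweight set of known IDs for the overwrite check, and holds full payloads only until the persistence manager confirms the write:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of option 1: keep only item IDs long term,
// and hold full payloads only while they await persistence.
public class PendingItemCache {
    // IDs of every item known to the node (cheap: strings only).
    private final Set<String> knownIds = ConcurrentHashMap.newKeySet();
    // Full payloads, held only until the persistence manager flushes them.
    private final Map<String, String> pendingPayloads = new ConcurrentHashMap<>();

    /** Records a publish; returns true if it overwrites an existing item. */
    public boolean publish(String itemId, String payload) {
        boolean overwrite = !knownIds.add(itemId);
        pendingPayloads.put(itemId, payload);
        return overwrite;
    }

    /** Called by the persistence manager after it has written the item. */
    public void markPersisted(String itemId) {
        pendingPayloads.remove(itemId);
    }

    /** Number of payloads still waiting to be written. */
    public int pendingCount() {
        return pendingPayloads.size();
    }

    public static void main(String[] args) {
        PendingItemCache cache = new PendingItemCache();
        System.out.println(cache.publish("item1", "<entry>first</entry>"));  // false: new item
        System.out.println(cache.publish("item1", "<entry>second</entry>")); // true: overwrite
        cache.markPersisted("item1");
        System.out.println(cache.pendingCount());                            // 0
    }
}
```

Memory use then grows with the number of item IDs rather than with payload size, which is what makes the overwrite check affordable.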
All in-memory storage of published items has been removed from the pubsub and PEP services, as well as from the leaf nodes.
All responsibility for persistence, as well as for any retrieval operations, has been moved into the PubSubPersistenceManager.
I would speculate that this will eliminate the PEP-related memory leak as well, as it was likely caused by the same problem mentioned here.
Basically, if a persistent node is configured with a large or unlimited maximum number of items, it will keep storing all items for that node in memory. Enough data will cause IndexOutOfBoundsException in internal queues, and a large number of nodes will cause OutOfMemoryError.
Both of these issues should now be handled properly.
My pubsub content is lost on every login. Is that normal?
If the node is persistent, then the content is stored in the database and retrievable by a request to get the items.
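For reference, under XEP-0060 the persisted items of a node can be fetched with an items request like the following (the addresses and node name are placeholders):

```xml
<iq type='get' from='user@example.com/resource' to='pubsub.example.com' id='items1'>
  <pubsub xmlns='http://jabber.org/protocol/pubsub'>
    <items node='princely_musings'/>
  </pubsub>
</iq>
```

The service replies with the stored items for that node, so persistent content survives the login/logout cycle.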
How can I set it to be persistent?
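Node persistence is controlled by the `pubsub#persist_items` field of the node configuration form defined in XEP-0060. A minimal owner configuration request might look like this (the address and node name are placeholders):

```xml
<iq type='set' to='pubsub.example.com' id='config1'>
  <pubsub xmlns='http://jabber.org/protocol/pubsub#owner'>
    <configure node='princely_musings'>
      <x xmlns='jabber:x:data' type='submit'>
        <field var='FORM_TYPE' type='hidden'>
          <value>http://jabber.org/protocol/pubsub#node_config</value>
        </field>
        <!-- 1 = store published items in the database -->
        <field var='pubsub#persist_items'><value>1</value></field>
      </x>
    </configure>
  </pubsub>
</iq>
```

You can also set `pubsub#persist_items` at node creation time by including the same form inside the `<create/>` request.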