We needed to make sure, at the lowest level, that the state was read from disk rather than depending on the upper-layer consumer.
Signed-off-by: Derek Collison <derek@nats.io>
This PR effectively reverts part of #4084 which removed the coalescing
from the outbound queues as I initially thought it was the source of a
race condition.
Further investigation has proven that not only was that untrue (the race
actually came from the WebSocket code, all coalescing operations happen
under the client lock) but removing the coalescing also worsens
performance.
Signed-off-by: Neil Twigg <neil@nats.io>
This is specifically when a cluster is reconfigured and the servers are
restarted with a new cluster name.
Signed-off-by: Derek Collison <derek@nats.io>
Originally I thought there was a race condition happening here,
but it turns out it is safe after all and the race condition I
was seeing was due to other problems in the WebSocket code.
Signed-off-by: Neil Twigg <neil@nats.io>
In single server mode healthz could mistake a snapshot staging
directory during a restore for an account.
If the restore took a long time, stalled, or was aborted, this would
cause healthz to fail.
Signed-off-by: Derek Collison <derek@nats.io>
This is a simpler way to determine whether we need to consider a snapshot, and it takes far less time, CPU, and memory.
Signed-off-by: Derek Collison <derek@nats.io>
Also, if we are timing out and trying to become a candidate while doing a catchup, check whether we are stalled.
Signed-off-by: Derek Collison <derek@nats.io>
When adding or updating sources/mirrors, the server was checking whether
a stream with the given name exists, in order to check for subject
overlaps, among other things.
However, if the sourced/mirrored stream was `External`, these checks
should not be executed: not only would the stream never be found, but if
the `External` stream had the same name as the sourcing stream, the
check would wrongly be performed against itself.
cc @jnmoyne
Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
Fixes the flapping test [TestJetStreamClusterDeleteAndRestoreAndRestart]: we would skip the snapshot because the hash was the same, but we still had entries that would erase stream data.
Signed-off-by: Derek Collison <derek@nats.io>
Make sure to process consumer entries on recovery in case state was not
committed.
Also sync other consumers when taking over as leader, but there is no
need to process snapshots when we are in fact the leader.
Do not let the consumer's redelivered count go down when re-applying
state on restarts.
Signed-off-by: Derek Collison <derek@nats.io>
Improve performance of storing messages when multiple subjects exist
with multiple messages and store limits are being hit.
This will not fix performance degradation per se when per-subject
limits are being hit constantly across a vast array of keys; that will
be addressed more fully in 2.10.
Resolves #4043
Signed-off-by: Derek Collison <derek@nats.io>
Some write operations in the filestore require re-encrypting the
entire block, so having large blocks can make these operations
slow and hurt performance.
Signed-off-by: Neil Twigg <neil@nats.io>