This can happen when we reset a stream internally and the stream had a prior snapshot.
Also make sure to always release resources back to the account, even if the store is no longer present.
Signed-off-by: Derek Collison <derek@nats.io>
This is specifically when a cluster is reconfigured and the servers are restarted with a new cluster name.
Signed-off-by: Derek Collison <derek@nats.io>
This PR has general improvements and fixes to filestore, raft, and the
clustering layer.
Summary
1. Additional support for preAck handling for interest-based streams
when replicated acks arrive before the message itself.
2. Better handling when checking state to determine whether to remove an
interest-based message.
3. Improved StepDown() and leadership transfer handling after restarts.
4. Improved voting logic for high load systems.
5. Various improvements and fixes for filestore Compact(), which is used
heavily in the raft layer when updating snapshots and the raft wal.
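
As a rough illustration of item 1, the sketch below shows one way to remember acks that arrive before the message they refer to. It is not the code from this PR: the `preAcks` type, its methods, and the consumer names are hypothetical, and real consumers are identified by more than a string.

```go
package main

import "fmt"

// preAcks is a hypothetical bookkeeping structure: for each stream sequence
// it records which consumers acked before the message itself arrived.
type preAcks map[uint64]map[string]struct{}

// register notes that consumer acked seq ahead of the message.
func (p preAcks) register(seq uint64, consumer string) {
	if p[seq] == nil {
		p[seq] = make(map[string]struct{})
	}
	p[seq][consumer] = struct{}{}
}

// fullyAcked reports whether every interested consumer has already acked seq.
// With no interested consumers it returns false, so nothing is dropped early.
func (p preAcks) fullyAcked(seq uint64, interested []string) bool {
	acked := p[seq]
	for _, c := range interested {
		if _, ok := acked[c]; !ok {
			return false
		}
	}
	return len(interested) > 0
}

// clear drops the bookkeeping for seq once the message has been handled.
func (p preAcks) clear(seq uint64) { delete(p, seq) }

func main() {
	p := preAcks{}
	interested := []string{"c1", "c2"}

	p.register(10, "c1")
	fmt.Println(p.fullyAcked(10, interested)) // false: c2 has not acked yet

	p.register(10, "c2")
	fmt.Println(p.fullyAcked(10, interested)) // true: message 10 need not be kept

	p.clear(10)
}
```

When the message finally arrives, a replica could consult `fullyAcked` to decide whether the message still needs to be retained for any interested consumer.
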
Signed-off-by: Derek Collison <derek@nats.io>
1. If reset, ignore Applied() calls that are greater than our commit.
2. Improved StepDown() by placing at back of queue if preferred.
3. Improved handling of leadership transfer during StepDown().
4. Do not store EntryLeaderTransfer records on disk.
5. Remove unneeded processing of older terms.
6. If append entry has higher term, also inherit pterm.
7. Only inherit a candidate's term if we decide to vote for them.
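
Item 7 can be sketched roughly as follows. This is a simplified illustration, not the raft code in this commit: the `node` and `voteRequest` types and their fields are hypothetical, and persistence, locking, and election timers are left out.

```go
package main

import "fmt"

// voteRequest is a hypothetical, simplified vote request.
type voteRequest struct {
	term      uint64 // candidate's term
	lastTerm  uint64 // term of the candidate's last log entry
	lastIndex uint64 // index of the candidate's last log entry
	candidate string
}

// node is a hypothetical, simplified follower state.
type node struct {
	term   uint64 // current term
	pterm  uint64 // term of our last log entry
	pindex uint64 // index of our last log entry
	vote   string // who we voted for in the current term
}

// processVoteRequest grants or denies a vote. The node's term is only
// changed on the path where the vote is granted.
func (n *node) processVoteRequest(vr voteRequest) bool {
	// Stale candidate.
	if vr.term < n.term {
		return false
	}
	// Already voted for someone else in this term.
	if vr.term == n.term && n.vote != "" && n.vote != vr.candidate {
		return false
	}
	// The candidate's log must be at least as up to date as ours.
	upToDate := vr.lastTerm > n.pterm ||
		(vr.lastTerm == n.pterm && vr.lastIndex >= n.pindex)
	if !upToDate {
		return false // vote denied, our term is left untouched
	}
	// Vote granted: only now do we inherit the candidate's term.
	n.term = vr.term
	n.vote = vr.candidate
	return true
}

func main() {
	n := &node{term: 5, pterm: 5, pindex: 100}

	// Higher term but a log that is behind ours: denied, term stays at 5.
	fmt.Println(n.processVoteRequest(voteRequest{term: 7, lastTerm: 4, lastIndex: 50, candidate: "A"}), n.term)

	// Higher term and an up-to-date log: granted, term becomes 7.
	fmt.Println(n.processVoteRequest(voteRequest{term: 7, lastTerm: 5, lastIndex: 100, candidate: "B"}), n.term)
}
```
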
Signed-off-by: Derek Collison <derek@nats.io>
This could lead to instability in the system.
The bug would manifest in replicated consumers when certain messages could be acked out of order and the pending list would never go to zero.
Signed-off-by: Derek Collison <derek@nats.io>
This commit fixes an issue when scaling a stream that has MaxAge set
and some older messages stored. Until now, old messages were not properly
expired on new replicas, because a new replica's first expiry timer
was set to the full MaxAge duration.
This commit adds a check for whether a received message's expiry happens
before MaxAge elapses, meaning the message is older than the replica.
https://github.com/nats-io/nats-server/issues/3848
Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
If there was a spurious error on restart, or possibly on an update, we could delete a consumer, which was incorrect behavior.
Signed-off-by: Derek Collison <derek@nats.io>
When `js-enabled` was set to true, the condition was only checked if
the `getJetStream()` call returned `nil`. However, if it was non-nil,
all remaining checks were executed, including assessing the health
of the assets (streams and consumers).
This change addresses two issues:
- Switch to use `js.isEnabled()` which will check whether the value
is nil OR `js.disabled = true` which can occur if the subsystem
is temporarily disabled (insufficient resources).
- Correctly exit the check after the assertion and before meta and
asset checks are performed.
In addition, the option has been renamed to `js-enabled-only` to align
with the `js-server-only` naming. The previous `js-enabled` name still
works, but is mapped to this new option. A warning is emitted noting
the previous option is deprecated.
Fix #3703
Signed-off-by: Byron Ruth <b@devel.io>