Derek Collison
e5625b9d9b
If a leader is asked for an item and we have no items left, make sure to also step-down.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-08-25 10:20:07 -07:00
Derek Collison
2669f77190
Make sure to reset election timer on catching up
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-08-24 19:58:08 -07:00
Neil Twigg
6d9955d212
Send peer state when adding peers
...
Signed-off-by: Neil Twigg <neil@nats.io >
2023-06-08 15:25:18 +01:00
Derek Collison
238282d974
Fix some data races detected in internal testing
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-03 13:58:15 -07:00
Derek Collison
8e825001d2
When we receive a catchup request for an item beyond our current state, we should stepdown.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-17 17:30:35 -07:00
Derek Collison
a17357c6ae
When doing leadership transfer, stepdown as soon as we know we have sent the EntryLeaderTransfer entry.
...
Delaying could allow something to be sent from the old leader which would cause the new leader to bail on being a candidate even though it would have gotten all the votes.
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-10 12:27:33 -07:00
Derek Collison
717afae9ef
When doing a leader transfer clear vote state on leader and when non-chosen peers receive the update
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-10 07:49:22 -07:00
Derek Collison
b9af0d0294
Only do no-leader stepdown on transfer after a delay if we are still the leader
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-09 17:19:14 -07:00
Derek Collison
ae73f7be55
Small raft improvements.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-02 16:44:27 -07:00
Derek Collison
546dd0c9ab
Make sure we can recover an underlying node being stopped.
...
Do not return healthy if the node is closed, and wait a bit longer for forward progress.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-29 07:42:23 -07:00
Derek Collison
3c964a12d7
Migration could be delayed due to transferring leadership while the new leader was still paused.
...
Also check quicker but slow down if the state we need to have is not there yet.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-25 18:58:49 -07:00
Waldemar Quevedo
d9cc8b0363
fix formatting of raft debug log
...
Signed-off-by: Waldemar Quevedo <wally@nats.io >
2023-04-22 07:07:08 +02:00
Derek Collison
a5f5603645
Reset our WAL on edge conditions instead of trying to recover.
...
Also, if we are timing out and trying to become a candidate but are doing a catchup, check if we are stalled.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-15 12:23:44 -07:00
Derek Collison
66ca46e145
If we see another leader with same term we should step down
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-14 16:21:40 -07:00
Waldemar Quevedo
a4833d0889
Fix raft log debug reloading
...
Signed-off-by: Waldemar Quevedo <wally@nats.io >
2023-04-13 14:57:04 -07:00
Derek Collison
808a2e8c90
On failure to send snapshot to follower, also reset, and on reset make sure to reset term
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-12 11:48:22 -07:00
Derek Collison
a92bb9fe61
Fix bad unlock which could cause crash
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-12 11:48:22 -07:00
Derek Collison
340fcc90bc
Basic raft tests
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-12 11:48:22 -07:00
Derek Collison
80a57a3d51
Remove peers from string intern map
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-09 08:01:36 -07:00
Derek Collison
6fa55540a7
Better use of entryPool
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-09 07:48:31 -07:00
Derek Collison
35bb7c1737
Pool CommittedEntries as well with a ReturnToPool() that will also recycle the Entry. Needs to integrate with upper layers
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-08 11:34:10 -07:00
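A minimal sketch of the pooling idea described in the commit above, using Go's standard sync.Pool. The type layouts and the helper names newEntry and newCommittedEntry are assumptions made for illustration; only the names entryPool, CommittedEntry, Entry and ReturnToPool() come from the commit messages, and the server's actual implementation may differ.

```go
package main

import "sync"

// Entry and CommittedEntry are simplified stand-ins for the raft types named
// in the commit message; the field layout here is an assumption.
type Entry struct {
	Type byte
	Data []byte
}

type CommittedEntry struct {
	Index   uint64
	Entries []*Entry
}

// One pool per type, so recycled values can be reused without reallocating.
var entryPool = sync.Pool{New: func() any { return &Entry{} }}
var cePool = sync.Pool{New: func() any { return &CommittedEntry{} }}

// newEntry (hypothetical helper) takes an Entry from the pool and fills it in.
func newEntry(t byte, data []byte) *Entry {
	e := entryPool.Get().(*Entry)
	e.Type, e.Data = t, data
	return e
}

// newCommittedEntry (hypothetical helper) takes a CommittedEntry from the pool.
func newCommittedEntry(index uint64, entries []*Entry) *CommittedEntry {
	ce := cePool.Get().(*CommittedEntry)
	ce.Index, ce.Entries = index, entries
	return ce
}

// ReturnToPool recycles the CommittedEntry and every Entry it holds.
// The caller must not use ce or its entries afterwards.
func (ce *CommittedEntry) ReturnToPool() {
	if ce == nil {
		return
	}
	for i, e := range ce.Entries {
		if e == nil {
			continue
		}
		// Clear references so pooled values do not pin old buffers.
		e.Type, e.Data = 0, nil
		entryPool.Put(e)
		ce.Entries[i] = nil
	}
	ce.Index, ce.Entries = 0, nil
	cePool.Put(ce)
}
```

In a design like this, the layer that consumes a CommittedEntry is the one that has to call ReturnToPool() once it is done, which is presumably why the commit notes that the change needs to integrate with upper layers.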
Derek Collison
3be25fdedb
Do not put an appendEntryResponse back in the pool if catching up until complete
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-07 10:30:06 -07:00
Derek Collison
2ff6f18ccd
Use sync.Map for peers vs internal storage for appendEntryResponses
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-07 08:16:42 -07:00
Derek Collison
1caa56a34f
Use pools for appendEntries
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-07 07:38:19 -07:00
Derek Collison
3afdb99f75
Use pools for appendEntryResponses. Also use interior space for peer name from the wire
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-07 06:43:51 -07:00
Derek Collison
e76b0b9b96
Move check for out of resources which would want a read lock out of inline processing
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-05 20:28:19 -07:00
Derek Collison
b806a8e7e7
Do not opt-out of normal processing for leadership transfers, but make sure they are only processed if explicitly new
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-03 14:46:55 -07:00
Derek Collison
58ca525b3b
Process replicated ack regardless of store update. Delay but still stepdown
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-02 03:53:16 -07:00
Derek Collison
874b2b2e02
Hold the lock while checking health since we could update catchup state.
...
Do not stepdown right away when executing leadership transfer, wait for the commit.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-02 03:53:08 -07:00
Derek Collison
4646f4af5d
Do not allow any JetStream leaders to be placed on a lameduck server
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-29 20:15:41 -07:00
Derek Collison
e274693490
On bad or corrupt message load during commit, reset WAL vs mark write error
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-29 14:07:14 -07:00
Derek Collison
35d1a7747a
Snapshots of no length can hold state as well
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-29 12:44:04 -07:00
Derek Collison
182bf6cbae
Bug fixes and general stability improvements.
...
1. If reset, ignore Applied() that are greater than our commit.
2. Improved StepDown() by placing at back of queue if preferred.
3. Improved handling of leadership transfer during StepDown().
4. Do not store EntryLeaderTransfer records on disk.
5. Remove un-needed processing of older terms.
6. If append entry has higher term, also inherit pterm.
7. Only inherit a candidate's term if we decide to vote for them.
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-29 12:43:46 -07:00
Derek Collison
ec89823e1c
Only process out of resources condition from raft layer if err matches condition
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-23 08:13:22 -07:00
Derek Collison
ed9de4b0a1
Improved publisher performance on interest-based streams in some clusters with asymmetric network latency.
...
In clusters with asymmetric network latency, if a node in an R3 was replicating a consumer and the parent stream but was the leader of neither, and the path from the stream leader was faster than the path from the consumer leader, a replicated ack could arrive before the message itself.
In this case we used to forward a delete message request to the stream leader which would then replicate that to all stream replicas, causing more work which could lead to increased publisher times on clients connected to the slow node.
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-20 20:53:45 -07:00
Derek Collison
0c1301ec14
Fix for data race
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-19 10:52:52 -07:00
Derek Collison
531fadd3e2
Don't warn if error is node closed.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-15 16:45:33 -07:00
Derek Collison
2beca1a2a6
Partial cache errors are also not critical write errors
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-01 22:52:02 -08:00
Derek Collison
c586014477
General raft improvements under heavy corruption.
...
Do not exit candidate state in place when stepping down, would cause double vote requests.
When truncating our WAL make sure to adjust commit and applied as needed.
On a miss where the index is less than ours, if we can not find the entry reset our state.
For a vote, if last processed term is higher than ours always agree if no vote has been cast.
If terms are equal make sure the requestor's index is at least as high as ours.
If we decide not to vote for someone, and we have not voted and we are a better fit, move forward with a campaign.
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-01 22:06:50 -08:00
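The voting rules listed in the commit above can be sketched as a small pure function. This is an illustration under stated assumptions, not the server's code: the types voteRequest and logState and the function decideVote are hypothetical, and the sketch ignores everything else a real vote handler would check (for example the request's own term and persisted vote state).

```go
package main

// voteRequest is a hypothetical, simplified vote request.
type voteRequest struct {
	lastTerm  uint64 // term of the candidate's last processed entry
	lastIndex uint64 // index of the candidate's last processed entry
}

// logState is a hypothetical view of our own log and vote state.
type logState struct {
	pterm  uint64 // term of our last processed entry
	pindex uint64 // index of our last processed entry
	voted  bool   // whether we have already cast a vote
}

// decideVote returns whether to grant the vote and, if not, whether we
// should move forward with our own campaign.
func decideVote(s logState, vr voteRequest) (grant, campaign bool) {
	if s.voted {
		// A vote has already been cast, so do nothing further.
		return false, false
	}
	switch {
	case vr.lastTerm > s.pterm:
		// Candidate's last processed term is higher than ours: always agree.
		return true, false
	case vr.lastTerm == s.pterm && vr.lastIndex >= s.pindex:
		// Equal terms: the candidate's index must be at least as high as ours.
		return true, false
	}
	// We declined, have not voted, and our log is ahead of the candidate's,
	// so we are the better fit and campaign ourselves.
	return false, true
}
```

For example, with our state at term 3 and index 10, a candidate at term 3 and index 9 is refused and triggers a campaign, while a candidate at term 4 and index 1 is granted the vote.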
Derek Collison
fa8afba68f
Only warn on write errors if not closed in case they linger under pressure and blocking on dios
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-27 18:56:55 -08:00
Derek Collison
2711460b7b
Prevent benign spin between competing leaders with same index but different term.
...
Remove lock from route processing for updating peers progress, already handled in trackPeer.
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-27 11:21:33 -08:00
Derek Collison
4fa0ea32c3
[FIXED] If a truncate for a raft WAL failed we could spin.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-25 19:07:27 -08:00
Derek Collison
ea2bfad8ea
Fixed bug where snapshot would not compact through applied. This meant a subsequent request for exactly applied would return only that entry, not the full state snapshot.
...
Fixed bug where we would not snapshot when we should.
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-23 22:19:37 -08:00
Derek Collison
45859e6476
Make sure preferred peer for stepdown is healthy.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-23 13:06:13 -08:00
Neil Twigg
68961ffedd
Refactor ipQueue to use generics, reduce allocations
2023-02-21 14:50:09 +00:00
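A rough sketch of what a generics-based in-process queue can look like. The type queue[T] and its methods are hypothetical and much simpler than the server's ipQueue; the point is only that a typed element slice lets callers avoid wrapping each value in an interface and asserting it back out on every pop.

```go
package main

import "sync"

// queue is a hypothetical, simplified generic queue protected by a mutex.
type queue[T any] struct {
	mu   sync.Mutex
	elts []T
	ch   chan struct{} // signaled when the queue goes from empty to non-empty
}

// newQueue creates an empty queue with a one-slot notification channel.
func newQueue[T any]() *queue[T] {
	return &queue[T]{ch: make(chan struct{}, 1)}
}

// push appends v and notifies a waiting consumer if the queue was empty.
func (q *queue[T]) push(v T) {
	q.mu.Lock()
	wasEmpty := len(q.elts) == 0
	q.elts = append(q.elts, v)
	q.mu.Unlock()
	if wasEmpty {
		// Non-blocking send: one pending notification is enough.
		select {
		case q.ch <- struct{}{}:
		default:
		}
	}
}

// pop returns all queued elements in order and leaves the queue empty.
func (q *queue[T]) pop() []T {
	q.mu.Lock()
	elts := q.elts
	q.elts = nil
	q.mu.Unlock()
	return elts
}
```

A consumer would typically wait on ch and then drain the queue with pop(), handling a whole batch per wakeup.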
Derek Collison
e028b7230a
Need to compact wal on snapshot to pindex+1
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-20 14:37:37 -08:00
Derek Collison
9c02be2409
Various fixes for snapshots.
...
Due to a bug, in rare circumstances we could write an empty snapshot for applied == 0. This would cause spinning at the raft layer.
1. Allow Truncate() to also properly do a reset of the store when the terms were the only mismatch.
2. During testing fixed memstore truncate and also made sure per subject info was also cleaned up.
3. Then added fix to detect a bad snapshot on initialization and remove.
4. Do not allow snapshots for applied == 0.
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-04 13:46:06 -08:00
Derek Collison
e9a983c802
Do not let !NeedSnapshot() avoid snapshots and compaction.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-01 22:05:25 -07:00
Derek Collison
6058056e3b
Minor fixes and optimizations for snapshots.
...
We were snapshotting more than needed, so double check that we should be doing this at the stream and consumer level.
At the raft level, we should have always been compacting the WAL to last+1, so made that consistent. Also fixed bug that would not skip last if more items behind the snapshot.
Signed-off-by: Derek Collison <derek@nats.io >
2023-01-30 17:54:18 -08:00
Derek Collison
bf49f23bb1
Only hold on to so many pending in memory, will fetch from WAL
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-01-28 11:34:55 -08:00
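One way to picture the idea in the commit above is a bounded in-memory cache in front of the WAL. Everything in this sketch is an assumption for illustration: the type pendingLog, the limit maxCached, and the use of a map as a stand-in for the on-disk WAL are not taken from the server's code.

```go
package main

// maxCached is an illustrative limit on entries held in memory.
const maxCached = 2

// pendingLog is a hypothetical structure: a small in-memory cache of pending
// entries backed by a WAL (modeled here as a map for simplicity).
type pendingLog struct {
	cache map[uint64][]byte // entries kept in memory, at most maxCached
	wal   map[uint64][]byte // stand-in for the on-disk write-ahead log
}

func newPendingLog() *pendingLog {
	return &pendingLog{
		cache: make(map[uint64][]byte),
		wal:   make(map[uint64][]byte),
	}
}

// append always writes to the WAL, but only keeps the entry in memory
// while the cache is below its limit.
func (p *pendingLog) append(index uint64, data []byte) {
	p.wal[index] = data
	if len(p.cache) < maxCached {
		p.cache[index] = data
	}
}

// load returns the entry for index, preferring memory and falling back
// to the WAL. fromCache reports which path served the request.
func (p *pendingLog) load(index uint64) (data []byte, fromCache, ok bool) {
	if data, ok = p.cache[index]; ok {
		return data, true, true
	}
	data, ok = p.wal[index]
	return data, false, ok
}
```

With maxCached set to 2, appending entries 1, 2 and 3 keeps the first two in memory and serves the third from the WAL on load.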