Derek Collison
2d21bc7008
Fix data race
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-10-03 15:35:20 -07:00
Derek Collison
dba03dbc2f
Optimizations to reduce contention for high connection counts in a JetStream-enabled account with high API usage.
...
Several strategies were used, listed below.
1. Checking a RaftNode to see if it is the leader now uses atomics.
2. Checking if we are the JetStream meta leader from the server now uses an atomic.
3. Accessing the JetStream context no longer requires a server lock, uses atomic.Pointer.
4. Filestore syncBlocks would hold msgBlock locks during sync, now does not.
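A minimal sketch of the idea behind items 1 to 3, assuming Go 1.19+ for `atomic.Bool` and `atomic.Pointer`. The types `node`, `server`, and `jsContext` and their methods are hypothetical stand-ins for illustration, not the server's actual definitions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// node is a hypothetical stand-in for a Raft node.
// The leader flag is read far more often than it is written,
// so it is kept in an atomic instead of behind a mutex.
type node struct {
	leader atomic.Bool
}

func (n *node) setLeader(v bool) { n.leader.Store(v) }
func (n *node) isLeader() bool   { return n.leader.Load() }

// jsContext is a hypothetical stand-in for the JetStream context.
type jsContext struct{ name string }

// server holds the context in an atomic.Pointer so readers
// do not need to take a server-wide lock to fetch it.
type server struct {
	js atomic.Pointer[jsContext]
}

// getJetStream returns nil until a context has been stored.
func (s *server) getJetStream() *jsContext { return s.js.Load() }

func main() {
	n := &node{}
	n.setLeader(true)
	fmt.Println("leader:", n.isLeader())

	s := &server{}
	fmt.Println("context set:", s.getJetStream() != nil)
	s.js.Store(&jsContext{name: "js"})
	fmt.Println("context set:", s.getJetStream() != nil)
}
```

Readers on hot paths then pay for a single atomic load rather than contending on a lock shared with unrelated state.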
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-30 14:52:15 -07:00
Derek Collison
f95ef63ae1
In lameduck mode, shut down JetStream at the start; do not leave it running during connection drain.
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-24 14:42:59 -07:00
Neil Twigg
1f9ddf2bbd
Add Raft goroutine labels, tweak logging
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-16 11:15:06 +01:00
Derek Collison
f1bf4127c5
Merge branch 'main' into dev
2023-08-25 11:03:54 -07:00
Derek Collison
e5625b9d9b
If a leader is asked for an item and we have no items left, make sure to also step down.
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-25 10:20:07 -07:00
Derek Collison
fd50bc2918
Merge branch 'main' into dev
2023-08-24 21:10:22 -07:00
Derek Collison
2669f77190
Make sure to reset election timer on catching up
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-24 19:58:08 -07:00
Derek Collison
19eba1b8c8
Merge branch 'main' into dev
2023-06-08 09:34:41 -07:00
Neil Twigg
6d9955d212
Send peer state when adding peers
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-06-08 15:25:18 +01:00
Derek Collison
30d9dfd305
Merge branch 'main' into dev
2023-06-03 18:17:28 -07:00
Derek Collison
238282d974
Fix some data races detected in internal testing
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-06-03 13:58:15 -07:00
Derek Collison
ee87df250c
Merge branch 'main' into dev
2023-05-17 19:27:58 -07:00
Derek Collison
8e825001d2
When we receive a catchup request for an item beyond our current state, we should step down.
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-05-17 17:30:35 -07:00
Derek Collison
990ac56557
Merge branch 'main' into dev
2023-05-10 15:31:54 -07:00
Derek Collison
a17357c6ae
When doing a leadership transfer, step down as soon as we know we have sent the EntryLeaderTransfer entry.
...
Delaying could allow something to be sent from the old leader which would cause the new leader to bail on being a candidate even though it would have gotten all the votes.
Signed-off-by: Derek Collison <derek@nats.io>
2023-05-10 12:27:33 -07:00
Derek Collison
717afae9ef
When doing a leader transfer clear vote state on leader and when non-chosen peers receive the update
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-05-10 07:49:22 -07:00
Derek Collison
2f2440f270
Merge branch 'main' into dev
2023-05-09 20:11:53 -07:00
Derek Collison
b9af0d0294
Only do no-leader stepdown on transfer after a delay if we are still the leader
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-05-09 17:19:14 -07:00
Ivan Kozlovic
311e3feb5f
Merge branch 'main' into dev
2023-05-03 17:38:40 -06:00
Derek Collison
ae73f7be55
Small raft improvements.
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-05-02 16:44:27 -07:00
Derek Collison
0321eb6484
Merge branch 'main' into dev
2023-04-29 19:52:57 -07:00
Derek Collison
546dd0c9ab
Make sure we can recover from an underlying node being stopped.
...
Do not return healthy if the node is closed, and wait a bit longer for forward progress.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-29 07:42:23 -07:00
Derek Collison
4ebdb69daf
Merge branch 'main' into dev
2023-04-26 11:34:37 -07:00
Derek Collison
3c964a12d7
Migration could be delayed due to transferring leadership while the new leader was still paused.
...
Also check more quickly, but slow down if the state we need is not there yet.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-25 18:58:49 -07:00
Waldemar Quevedo
d9cc8b0363
fix formatting of raft debug log
...
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-04-22 07:07:08 +02:00
Derek Collison
3b3fac297a
Merge branch 'main' into dev
2023-04-15 14:21:39 -07:00
Derek Collison
a5f5603645
Reset our WAL on edge conditions instead of trying to recover.
...
Also, if we are timing out and trying to become a candidate but are doing a catchup, check whether we are stalled.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-15 12:23:44 -07:00
Derek Collison
8375ab5cde
Merge branch 'main' into dev
2023-04-14 16:44:25 -07:00
Derek Collison
66ca46e145
If we see another leader with the same term, we should step down
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-14 16:21:40 -07:00
Derek Collison
a319d24345
Merge branch 'main' into dev
2023-04-13 21:03:05 -07:00
Waldemar Quevedo
a4833d0889
Fix raft log debug reloading
...
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-04-13 14:57:04 -07:00
Derek Collison
808a2e8c90
On failure to send snapshot to follower, also reset, and on reset make sure to reset term
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-12 11:48:22 -07:00
Derek Collison
a92bb9fe61
Fix bad unlock which could cause crash
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-12 11:48:22 -07:00
Derek Collison
340fcc90bc
Basic raft tests
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-12 11:48:22 -07:00
Derek Collison
dfeac4a214
Merge branch 'main' into dev
2023-04-09 19:31:01 -07:00
Derek Collison
80a57a3d51
Remove peers from string intern map
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-09 08:01:36 -07:00
Derek Collison
6fa55540a7
Better use of entryPool
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-09 07:48:31 -07:00
Derek Collison
35bb7c1737
Pool CommittedEntries as well with a ReturnToPool() that will also recycle the Entry. Needs to integrate with upper layers
...
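A minimal sketch of the pooling pattern described in the subject line, using `sync.Pool` and assuming Go 1.18+ for `any`. The types `entry` and `committedEntry`, the two pool variables, and the lowercase `returnToPool` method are hypothetical illustrations, not the server's actual definitions.

```go
package main

import (
	"fmt"
	"sync"
)

// entry and committedEntry are hypothetical stand-ins for the
// pooled Raft types mentioned in the commit subject.
type entry struct {
	data []byte
}

type committedEntry struct {
	index   uint64
	entries []*entry
}

var (
	entryPool = sync.Pool{New: func() any { return &entry{} }}
	cePool    = sync.Pool{New: func() any { return &committedEntry{} }}
)

// newEntry takes an entry from the pool and fills it in.
func newEntry(data []byte) *entry {
	e := entryPool.Get().(*entry)
	e.data = data
	return e
}

// newCommittedEntry takes a committedEntry from the pool and fills it in.
func newCommittedEntry(index uint64, entries []*entry) *committedEntry {
	ce := cePool.Get().(*committedEntry)
	ce.index, ce.entries = index, entries
	return ce
}

// returnToPool recycles the committed entry and every entry it holds.
// Fields are cleared first so pooled objects do not pin old buffers.
// The caller must not use ce or its entries afterwards.
func (ce *committedEntry) returnToPool() {
	for _, e := range ce.entries {
		e.data = nil
		entryPool.Put(e)
	}
	ce.index, ce.entries = 0, nil
	cePool.Put(ce)
}

func main() {
	ce := newCommittedEntry(1, []*entry{newEntry([]byte("a")), newEntry([]byte("b"))})
	fmt.Println("entries:", len(ce.entries))
	ce.returnToPool() // upper layers would call this once the entry is applied
}
```

The note about integrating with upper layers matters here: whoever consumes a committed entry has to be the one to return it, since the pool cannot know when the consumer is done.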
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-08 11:34:10 -07:00
Derek Collison
3be25fdedb
Do not put an appendEntryResponse back in the pool if catching up until complete
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-07 10:30:06 -07:00
Derek Collison
2ff6f18ccd
Use sync.Map for peers vs internal storage for appendEntryResponses
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-07 08:16:42 -07:00
Derek Collison
1caa56a34f
Use pools for appendEntries
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-07 07:38:19 -07:00
Derek Collison
3afdb99f75
Use pools for appendEntryResponses. Also use interior space for peer name from the wire
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-07 06:43:51 -07:00
Derek Collison
ff8701b724
Merge branch 'main' into dev
2023-04-06 08:37:11 -07:00
Derek Collison
e76b0b9b96
Move check for out of resources which would want a read lock out of inline processing
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-05 20:28:19 -07:00
Derek Collison
1ae51b23a9
[ADDED] Multiple routes and ability to have per-account routes (#4001)
...
New configuration fields:
```
cluster {
...
pool_size: 5
accounts: ["A", "B"]
}
```
The configuration `pool_size` in the example above means that this
server will create 5 routes to a remote server, assuming that that
server has the same `pool_size` setting.
Accounts (which are not part of the `accounts[]` configuration)
are assigned a specific route in this pool, and this will be the
same route on all servers in the cluster.
Accounts that are defined in the `accounts` field will each have
a dedicated route connection. This will allow suppression of the
account name in some of the route protocols, reducing bytes transmitted
which may increase performance.
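The message does not say how an account is mapped to a route in the pool. One way to get the same route on every server is to derive the index from the account name alone; the sketch below assumes an FNV-1a hash modulo `pool_size`, which is an illustrative choice and not necessarily what the server does. The function name `routeIndex` is hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// routeIndex maps an account name to a slot in [0, poolSize).
// It depends only on the name and the pool size, so every server
// configured with the same pool size computes the same slot.
// poolSize must be positive.
// Assumption: FNV-1a is used here purely for illustration.
func routeIndex(account string, poolSize int) int {
	h := fnv.New32a()
	h.Write([]byte(account)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(poolSize))
}

func main() {
	const poolSize = 5
	// Accounts "A" and "B" from the example config would get dedicated
	// routes, so only the remaining accounts are hashed into the pool.
	for _, acc := range []string{"C", "D", "E"} {
		fmt.Printf("account %s -> pooled route %d\n", acc, routeIndex(acc, poolSize))
	}
}
```

A deterministic mapping like this also shows why the remote server needs the same `pool_size`: with different sizes, the two sides would compute different slots for the same account.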
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 15:33:46 -07:00
Derek Collison
b806a8e7e7
Do not opt-out of normal processing for leadership transfers, but make sure they are only processed if explicitly new
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-03 14:46:55 -07:00
Ivan Kozlovic
105237cba8
[ADDED] Multiple routes and ability to have per-account routes
...
New configuration fields:
```
cluster {
...
pool_size: 5
accounts: ["A", "B"]
}
```
The configuration `pool_size` in the example above means that this
server will create 5 routes to a remote server, assuming that that
server has the same `pool_size` setting.
Accounts (which are not part of the `accounts[]` configuration)
are assigned a specific route in this pool, and this will be the
same route on all servers in the cluster.
Accounts that are defined in the `accounts` field will each have
a dedicated route connection. This will allow suppression of the
account name in some of the route protocols, reducing bytes transmitted
which may increase performance.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:25 -06:00
Derek Collison
58ca525b3b
Process replicated ack regardless of store update. Delay but still step down
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-02 03:53:16 -07:00
Derek Collison
874b2b2e02
Hold the lock while checking health since we could update catchup state.
...
Do not step down right away when executing a leadership transfer; wait for the commit.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-02 03:53:08 -07:00