Derek Collison
0d29b0761a
Tweaked buffered channels, moved locks for snapshots.
...
Also placed debug for inline processing of append entries.
This is for removal of that inline.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-28 05:16:04 -08:00
R.I.Pienaar
a4817bd7b6
extend the out of space advisory
...
Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-02-26 11:10:05 +01:00
Derek Collison
98f98e214b
Properly support memory based WALs
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 19:49:54 -08:00
Derek Collison
0f69e48511
Bug check err, check for out of space on catchup
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 18:25:16 -08:00
Derek Collison
b13ef6b9ec
Track write errors. Fixed a few bugs.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 17:53:20 -08:00
Derek Collison
a862cc75cc
Suppress raft campaigns on restart. Extend election timeout interval.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 04:14:14 -08:00
Derek Collison
73ba2d0b2f
File writes to term and vote and peerstate were in the direct route path and could cause delays.
...
This moves the actual writes to a separate goroutine and also allows multiple writes to
be coalesced into one write under load, since we only want the latest value.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-24 20:47:31 -08:00
Derek Collison
78bdc34637
General stability improvements. Fixes to subscription state not cleaning up.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-24 08:44:34 -08:00
Ivan Kozlovic
1652fe62ef
Updates to when to do a snapshot
...
Remove panic on runAsLeader when not able to subscribe (which happens
on shutdown)
Gateway name access does not need lock since it is immutable. Will
prevent deadlocks in some situations.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-02-23 19:06:07 -07:00
Derek Collison
8fe8b835fe
Fixes for flapping tests
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-23 14:08:17 -08:00
Derek Collison
c39641c263
Tweak hb and election times, fix unsubscribe leak
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-23 10:57:05 -08:00
Derek Collison
fa8a74ceb5
Allow placement directives for metacontroller stepdown to allow placement to new clusters.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-19 10:55:22 -08:00
Derek Collison
9de18dfefe
Removed unused function
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-18 18:35:44 -08:00
Derek Collison
048011d7f1
Split vote improvements
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-18 18:29:18 -08:00
Derek Collison
89fe3b05df
various bug fixes, wal/snapshot stability
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-18 08:41:09 -08:00
Derek Collison
e21c7097f3
General stability improvements.
...
Original thought to move to memory based WALs was ill-advised and caused issues with stability around restarts.
Returned to file based but with async flush for the WAL itself.
Also the raft inline catchup has been improved.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-17 19:56:16 -08:00
Derek Collison
765b9ad57a
Some stability improvements to raft lib and catchup stream processing.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-16 20:30:12 -08:00
Derek Collison
ddc4cc79d2
Make sure to not process AR when no longer leader
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-16 15:58:46 -08:00
Derek Collison
0dcb006968
Handle AppendEntry response inline, lower outstanding on catchup to stabilize
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-16 13:24:09 -08:00
Derek Collison
4c6e33c9c6
Restoration of streams would possibly block route and client connections.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-14 18:43:40 -08:00
Derek Collison
f0cfc187d2
Set pindex to wrong setting on snapshot restore with no WAL
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-13 06:50:50 -08:00
Derek Collison
4759560e29
Fixed raft bug on catchup logic with external snapshots
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-12 19:58:02 -08:00
Derek Collison
579737a5e1
General fixes, stability improvements
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-11 18:13:24 -08:00
Derek Collison
fa8a95a06a
Improved snapshots and compactions.
...
Various bug fixes and stability improvements.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-11 11:16:00 -08:00
Derek Collison
92d64c2bcc
Reset WAL on mismatch catchup regardless, condition ok
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-07 09:30:13 -08:00
Derek Collison
a16affedca
Always reset election timeout on vote request
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-07 08:09:01 -08:00
Derek Collison
74a4c531c9
Stability improvements.
...
Changes to catchup logic, peer tracking, and vote responses.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-06 20:13:18 -08:00
Derek Collison
c49e3247bb
Purge operations would be replayed on restart regardless of whether they had already been processed.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-04 07:04:50 -08:00
Derek Collison
a1e0f7dc1a
First pass at supercluster enablement.
...
This allows metacontrollers to span superclusters. Also includes placement directives for streams. By default they select the request origin cluster.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-03 17:28:13 -08:00
Derek Collison
a8982c040f
Suppress lost quorum processing if too close to raft node creation time.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-02 06:27:07 -08:00
Derek Collison
f3703a4b85
Make sure audit events have the proper subject regardless of where processed.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-02 05:03:20 -08:00
Derek Collison
e5c1d65fff
Added in JS disable per server on reload. Also removing peers from a stream and leader stepdown for streams and consumers.
...
Various bug fixes, stability improvements.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-01 19:39:08 -08:00
Derek Collison
2b0717bde2
Make debug not error since we recover
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-30 14:00:26 -08:00
Derek Collison
9b20d5c888
Fixed bug on raft inline catchup when apply channel was full.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-30 13:22:27 -08:00
Derek Collison
457ca3b9cf
Suppress additional advisories on server restart and leadership changes.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-29 15:08:22 -08:00
Derek Collison
9d4951d2bb
Updated lost quorum signalling to be less fragile.
...
We will now alert when the old leader detects a lost quorum just as before, but also detect if a candidate is flapping and failing to get votes because of no quorum.
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-28 09:27:17 -08:00
Derek Collison
8b79114168
Add in advisories for leader elected and quorum lost.
...
Note that quorum lost only fires if the old leader steps down.
If the leader itself fails and that causes the loss of quorum, currently no advisory is sent.
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-28 08:37:54 -08:00
Derek Collison
a9b8948abe
Add in tracking for quorum in raft and do auto stepdown.
...
Also added in API responses when no leader is present for meta, streams and consumers.
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-27 13:34:00 -08:00
Derek Collison
c0ae719629
Don't load entry for snapshot, fix data race
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-26 19:26:03 -08:00
Derek Collison
054319a662
Fix for split vote bug
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-26 14:59:13 -08:00
Derek Collison
3e8d295239
Make sure to not go backwards on applied or commit indexes
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-26 14:07:52 -08:00
Derek Collison
bcd38bba96
Make sure stepdown logic does not block system
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-25 19:20:10 -08:00
Derek Collison
d278996272
LDM trigger to move raft leaders
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-25 16:52:19 -08:00
Derek Collison
7eb6d07bfc
On stepdown still process appendEntry
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-25 14:32:24 -08:00
Derek Collison
7d8c3eaa6e
Don't pre-vote, causes flapping on split vote
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-25 13:49:20 -08:00
Derek Collison
5148bbf898
Fixes based on PR feedback, cleanup
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-25 10:04:21 -08:00
Derek Collison
7b1e84c086
Fixed raft bug that would cause entries to be missed on restart with leader HB trigger.
...
Also added in creation times to stream and consumer assignments to make them consistent.
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-25 08:47:37 -08:00
Derek Collison
117607ef11
Fix for race and test for issue R.I. was seeing in nightly. Also fixed flappers.
...
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-24 21:21:02 -08:00
Derek Collison
9c858d197a
Added ability to properly restore consumers from a snapshot.
...
This made us add forwarding proposals functionality in the raft layer.
More general cleanup and bug fixes as well.
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-24 19:30:34 -08:00
Derek Collison
cad0db2aec
Cleanup the consumer assignments when consumers become inactive.
...
This involved extending our raft implementation to forward proposals to the current leader.
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-23 13:44:10 -08:00