Commit Graph

129 Commits

Author SHA1 Message Date
Derek Collison
1ea4a430da If we fail to load an account while processing a stream assignment, send error back to metaleader.
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-07 14:23:12 -07:00
Derek Collison
44ada49b16 During repeated server restarts or failures consumer state could drift between replicas.
We now make sure to sync state of the replicas when a new leader takes over. We also update ack floors regardless of detection on pending list.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-02 08:20:29 -07:00
Matthias Hanel
cd602231ac [Fixed] missing unlock and added a warning trace (#2054)

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-03-31 19:22:19 -04:00
Derek Collison
bb7a8a5f79 Introduced default max ack pending for ack explicit.
Fixed a bug that would introduce performance degradation for durable consumers R>1.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-30 11:47:24 -07:00
Derek Collison
5a48369b4b Make sure to not delete streams on bad updates.
If an update was assigned but failed at the stream group server, we would send back the result, which would always delete the stream.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-29 07:35:30 -07:00
Derek Collison
c564b18482 Protect against negative
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-26 05:28:00 -07:00
Derek Collison
5d6fe9e4b0 Check for subject overlaps after check for pre-existing
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-25 19:00:15 -07:00
Derek Collison
5d5de5925f Introduce a previous leader state in the raft layer to allow quicker responses when leaderless.
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-25 17:08:29 -07:00
Derek Collison
e53caee5e8 Enforce server limits even when dynamic limits for accounts are in play.
We were not properly enforcing server limits. This commit will allow a server to enforce limits but still remain functional even at the JetStream level.
Also fixed a bug for RAFT replay that could cause instability.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-25 16:06:27 -07:00
Derek Collison
a627db9fc8 Do not request streaminfo from streams that are completely offline.
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-24 10:26:09 -07:00
Derek Collison
06803dafbf Tweak seq tracking for flow control, also fixup code
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-24 09:46:54 -07:00
Derek Collison
2ed53035ed Reworked flow control for sources and mirrors.
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-24 07:07:33 -07:00
Derek Collison
a75e8f8c80 Fix for an issue with multiple restarts that showed stalled and sometimes lost streams.
When state was removed from a server and the server was restarted, it would catch up properly.
However, upon cluster restart the system could exhibit strange behaviors. This was because
catchup did not properly create a meta snapshot when one was received, leaving no meta state to recover.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-22 20:06:38 -07:00
Derek Collison
0f548edcc6 Reduce sliding window for direct consumers and catchup stream windows.
Remove another possible wire blocking operation in raft.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-21 09:24:27 -07:00
Derek Collison
faa6dc85eb Fix for flapping test
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-20 11:16:40 -07:00
Derek Collison
8eefff2b3b Make sure the jetstream accounts use the name as the key to the map.
This prevents possible double adds under reload or restart scenarios.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-18 17:29:26 -07:00
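
The effect of keying by name can be sketched as follows. This is a minimal illustration, not the server's code: the `account` and `registry` types and their methods are hypothetical, and the sketch assumes that a reload or restart may produce a fresh object for an account that is already registered.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical stand-in for a JetStream-enabled account.
type account struct{ name string }

// registry is a hypothetical container keyed by account name, so two
// distinct objects describing the same account map to one entry.
type registry struct {
	mu     sync.Mutex
	byName map[string]*account
}

func newRegistry() *registry {
	return &registry{byName: make(map[string]*account)}
}

// add registers the account and reports whether it was newly added.
// A second add for the same name is rejected, even if the pointer differs.
func (r *registry) add(a *account) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.byName[a.name]; ok {
		return false
	}
	r.byName[a.name] = a
	return true
}

// size returns the number of registered accounts.
func (r *registry) size() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.byName)
}

func main() {
	r := newRegistry()
	fmt.Println(r.add(&account{name: "ACME"})) // first add succeeds
	fmt.Println(r.add(&account{name: "ACME"})) // new object, same name: rejected
	fmt.Println(r.size())
}
```

A map keyed by pointer would have accepted both objects above, which is the kind of double add the commit message describes.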
Derek Collison
ee92cc9a5b Properly print when a stream is doing out of band catchup. Print node banner consistently
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-14 07:29:36 -07:00
Derek Collison
cbbe6dc9c5 Make API access consistent when determining that the system is not available.
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-14 06:18:04 -07:00
Derek Collison
2fa8668dd9 Only snap if needed
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-13 16:30:58 -05:00
Derek Collison
a3a35c0ddb Updated raft processing and dealing with remove peer.
Made sure to not remove us if we were remapped after the peer removal.
Fixed some raft behaviors.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-13 16:28:24 -05:00
Derek Collison
2fb2ced712 Removed unused functions
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-13 16:28:24 -05:00
Derek Collison
299f44cddf This changes our behaviors for streams and peer removals in several ways.
First we no longer try to auto-remap stream assignments on peer removals from the system.
We also now can always respond to stream info requests if at least a member is running.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-11 06:52:28 -05:00
Derek Collison
01404b3dc9 Protect against cluster and meta being gone
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-10 22:33:39 -05:00
Derek Collison
e5e8205fac Need to make sure the order of clseq as stamped also makes it to the propose chan.
However we do not want to hold the actual stream lock.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-09 00:34:33 -06:00
Derek Collison
673543c180 Modified flow control for clustered mode.
Set channels into and out of RAFT layers to block.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-08 12:58:57 -06:00
Derek Collison
d31fda5dac Added code to constrain size of WAL under most scenarios.
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-06 08:38:56 -08:00
Ivan Kozlovic
4e3b79f62b monitorConsumer perform snapshot similar to monitorStream
Changed the stream min size default value back to 32MB and removed
the one for consumer since we don't use it anymore, but set the
count size the same as for stream (8192).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-03-05 19:02:41 -07:00
Derek Collison
0b3c686430 Fixes for data races and some locking.
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-05 17:19:51 -08:00
Derek Collison
dd8acb1a99 Fixed a bug where we were not determining clustered state, so we were directly processing msgs from routes.
Cleaned up lseq and clseq code.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-05 12:00:19 -08:00
Derek Collison
7b1b9a7946 Snapshot on peer state change, e.g. removal
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-04 18:52:57 -08:00
Derek Collison
207ebd3b3d Changed stream sendq to linked list outq.
Made consumer share streams outq.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-04 17:19:50 -08:00
Derek Collison
d7201a110b Better handling on out of disk.
Suppress some stream and consumer bad results since they delete the asset.
Allow rehup to re-enable JetStream.
Various bug fixes and improvements.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-03 20:12:10 -08:00
Ivan Kozlovic
0f53bf6580 Fixed data race with nodeInfo
Took the approach of storing struct instead of pointer. Of course,
when changing the offline bool from false to true, it means that
we need to call Store again (with same key).

This is based on the assumption that those Load/Store are not too
frequent. Otherwise, we may need to use locking (and keep *nodeInfo)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-03-03 13:28:45 -07:00
Matthias Hanel
25ef6b0f0d Merge pull request #1952 from nats-io/goland-lint
Fixed linter issues
2021-03-02 21:43:04 -05:00
Matthias Hanel
c50ee2a1c6 [Changed] all times exposed will be computed in UTC (#1943)
This also applies to times that end up in that json.
Where applicable moved time.Now() to where it is used.
Moved calls to .UTC() to where time is created if that time is converted
later anyway.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-03-02 21:37:42 -05:00
Matthias Hanel
4f2db7d187 Fixed linter issues
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-03-02 20:21:44 -05:00
Derek Collison
2e7fdf2ef8 Only updateDelivered needs to be suppressed for leaders
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-02 07:03:32 -08:00
Derek Collison
9e181b8d0d Consumers were double processing as leaders
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-01 18:37:35 -08:00
Derek Collison
f16d9c6ea8 Don't forget last message
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-01 12:52:54 -08:00
Derek Collison
df77724aa4 Make ephemeral consumers R=1 and provide optimistic migration on peer removal or server shutdown.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-28 16:50:25 -08:00
Derek Collison
03954eedc6 Enable cluster server removal API.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-28 14:14:36 -08:00
Derek Collison
e0d08e1a22 Check for stream updates and disallow changes to mirrors and replicas for now.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-28 12:04:40 -08:00
Derek Collison
b9e1a921ff Use internal wildcard inbox for stream and consumer info requests.
More gateway friendly but suffers from no echo, so added new client based internal send.

Signed-off-by: Derek Collison <derek@nats.io>
2021-02-28 10:01:01 -08:00
Derek Collison
ef4567f24a Changes for sources and mirrors improvements.
Better handling of messages on restart from a WAL.

Signed-off-by: Derek Collison <derek@nats.io>
2021-02-28 05:18:48 -08:00
R.I.Pienaar
a4817bd7b6 extend the out of space advisory
Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-02-26 11:10:05 +01:00
Derek Collison
c6672260af Merge pull request #1937 from nats-io/wio
[FIXED] Bug where followers would not snapshot/compact WAL.
2021-02-25 21:14:52 -07:00
Derek Collison
98f98e214b Properly support memory based WALs
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 19:49:54 -08:00
Derek Collison
0f69e48511 Bug check err, check for out of space on catchup
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 18:25:16 -08:00
Derek Collison
e5c8774172 Handle out of space situations, general stability enhancements
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 17:54:29 -08:00
Derek Collison
a862cc75cc Suppress raft campaigns on restart. Extend election timeout interval.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 04:14:14 -08:00