Derek Collison
5490c4969b
Fixed a bug that on restore of single streams in clustered mode would subscribe to the stream subjects twice.
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-29 19:37:30 -08:00
Derek Collison
457ca3b9cf
Suppress additional advisories on server restart and leadership changes.
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-29 15:08:22 -08:00
Derek Collison
3c49f087a0
Merge pull request #1859 from nats-io/jsc_ai
...
Extended AccountInfo to track API calls and errors.
2021-01-29 10:44:42 -07:00
Derek Collison
c889321a83
Change to API.Total and API.Errors
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-29 09:42:20 -08:00
Derek Collison
d2a92221fb
Duplicate leader elect and lost advisories to the system account as well.
...
Also suppress lost quorums to at most once every 10 secs.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-29 08:51:20 -08:00
Derek Collison
0a3124e27d
Track API calls per account. Track success and errors.
...
These tracking data are ephemeral per server, so on restart they reset.
That should be ok since these will most likely be used more for rates.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-28 17:16:50 -08:00
Derek Collison
9d4951d2bb
Updated lost quorum signalling to be less fragile.
...
We will now alert when the old leader detects a lost quorum just as before, but also detect if a candidate is flapping and failing to get votes because of no quorum.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-28 09:27:17 -08:00
Derek Collison
8b79114168
Add in advisories for leader elected and quorum lost.
...
Note that quorum lost only fires if the old leader steps down.
If the leader itself fails and that causes the loss of quorum, currently no advisory is sent.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-28 08:37:54 -08:00
Derek Collison
ab5f98c7b4
Help with flappers
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-28 05:23:51 -08:00
Derek Collison
6d27307453
Fix for broken stream restore functionality
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-27 17:44:42 -08:00
Derek Collison
132a4e7f7d
Allow memory store for clustering
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-27 15:54:39 -08:00
Derek Collison
a9b8948abe
Add in tracking for quorum in raft and do auto stepdown.
...
Also added in API responses when no leader is present for meta, streams and consumers.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-27 13:34:00 -08:00
Derek Collison
9b6dbe112c
Make sure randomServer() adapts for shutdown servers
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 20:14:11 -08:00
Derek Collison
27f8cbd069
Wait for interest
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 19:46:43 -08:00
Derek Collison
83e2c719b7
Wait in case stopped server was also stream leader
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 19:00:51 -08:00
Derek Collison
0e5e9cb5ee
Possible retry in case peers have not committed state
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 18:54:42 -08:00
Derek Collison
e40c3e6f55
Templates not supported currently in clustered mode
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 17:13:31 -08:00
Derek Collison
c8a75e1ed0
test fixes
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 16:15:28 -08:00
Derek Collison
76058c5ec6
Timing for state propagation
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 14:32:38 -08:00
Derek Collison
c7c86c7929
Attempt to fix flapper
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 10:25:47 -08:00
Derek Collison
5148bbf898
Fixes based on PR feedback, cleanup
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 10:04:21 -08:00
Derek Collison
7b1e84c086
Fixed raft bug that would cause entries to be missed on restart with leader HB trigger.
...
Also added in creation times to stream and consumer assignments to make them consistent.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-25 08:47:37 -08:00
Derek Collison
117607ef11
Fix for race and test for issue R.I. was seeing in nightly. Also fixed flappers.
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-24 21:21:02 -08:00
Derek Collison
a72ddedb55
Fix for issue with stream info and R=1 and fix for a flapper
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-24 19:48:33 -08:00
Derek Collison
9c858d197a
Added ability to properly restore consumers from a snapshot.
...
This made us add forwarding proposals functionality in the raft layer.
More general cleanup and bug fixes as well.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-24 19:30:34 -08:00
Derek Collison
cad0db2aec
Cleanup the consumer assignments when consumers become inactive.
...
This involved extending our raft implementation to forward proposals to the current leader.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-23 13:44:10 -08:00
Derek Collison
d1d2d5b24e
Fix for consumer names list in clustered mode
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-23 10:09:44 -08:00
Derek Collison
d7cfb8f6e9
Use client version for stream and consumer extended info
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-22 13:11:36 -08:00
Derek Collison
a43a69a403
Fix for interest only, broken test
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-22 11:04:06 -08:00
Derek Collison
227901a56b
More cleanup and stabilization for consumers and failing when sending messages.
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-22 10:09:30 -08:00
Derek Collison
6f2b50a374
Added support for clustered account info and limit enforcement
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-21 18:47:21 -08:00
Derek Collison
74c06ed046
Shutdown cluster on errors
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-20 11:58:31 -08:00
Derek Collison
ff54c9dc9c
Reworked snapshot and restore.
...
Underestimated the effort to get stream restore working properly in cluster mode.
Some good bug fixes and stability improvements.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-20 11:58:31 -08:00
Derek Collison
a1730f1b31
Report on RAFT group information.
...
This adds in optional reporting to stream and consumer info when running in clustered mode.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-20 11:58:31 -08:00
Ivan Kozlovic
f5df209022
Fixed SIGSEGV when sending update for unknown stream
...
Will now return an error that the stream is unknown.
Resolves #1827
Signed-off-by: Ivan Kozlovic <ivan@synadia.com >
2021-01-20 12:42:14 -07:00
Ivan Kozlovic
7d1a4778b8
Merge pull request #1826 from nats-io/fix_consumer_loop_delivery_exit
...
Fix stop of consumer's delivery loop
2021-01-20 10:34:57 -07:00
Ivan Kozlovic
c4a284b58f
Fix stop of consumer's delivery loop
...
I noticed that some consumer go routines were left running at the end
of the test suite.
It turns out that there was a race in the way the consumer's qch was closed.
Since it was closed and then set to nil, it is possible that the go
routines that are started and then try to capture o.qch would actually
get qch==nil, which would then block forever when doing a select on
that nil channel.
So we now pass the qch to the 2 go routines loopAndGatherMsgs() and
loopAndDeliverMsgs() so that when we close the channel there is
no risk of that race happening.
I do believe that there is still something that should be looked at:
it seems that a consumer's delivery loop can now be started/stopped
many times based on leadership acquired/lost. If that is the case,
I think that the consumer should wait for previous go routine to
complete before trying to start new ones.
Also moved 3 JetStream tests to the test/norace_test.go file because
they would consume several GB of memory when running with the -race flag.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com >
2021-01-19 17:39:32 -07:00
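The race described in c4a284b58f can be reproduced in miniature. The sketch below is not the server's consumer code: `consumer`, `loop` and `wouldBlock` are simplified, hypothetical stand-ins. It shows the pattern the commit describes, handing the quit channel to the goroutine as an argument so the goroutine never reads a field that `stop()` may already have set to nil, and it also shows why a nil channel is a problem in a select.

```go
package main

import (
	"fmt"
	"sync"
)

// consumer is a stripped-down stand-in, not the server's consumer type.
type consumer struct {
	mu  sync.Mutex
	qch chan struct{} // closed to tell the loop to stop
	wg  sync.WaitGroup
}

// start creates the quit channel and passes it to the loop as an argument.
// The argument is evaluated here, under the lock, before the goroutine runs.
func (c *consumer) start() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.qch = make(chan struct{})
	c.wg.Add(1)
	go c.loop(c.qch)
}

// loop blocks until the channel it was given is closed.
func (c *consumer) loop(qch chan struct{}) {
	defer c.wg.Done()
	<-qch
}

// stop closes the quit channel and then clears the field.
func (c *consumer) stop() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.qch != nil {
		close(c.qch)
		c.qch = nil
	}
}

// wouldBlock reports whether a receive on ch cannot proceed right now.
func wouldBlock(ch chan struct{}) bool {
	select {
	case <-ch:
		return false
	default:
		return true
	}
}

func main() {
	c := &consumer{}
	c.start()
	c.stop()
	c.wg.Wait() // returns because the loop saw the close on its own copy

	var nilCh chan struct{}
	closed := make(chan struct{})
	close(closed)
	fmt.Println(wouldBlock(nilCh), wouldBlock(closed)) // prints "true false"
}
```

The `sync.WaitGroup` is an addition of this sketch; it is one possible way to wait for a previous loop to finish before starting a new one, which is the open question raised at the end of the commit message.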
Ivan Kozlovic
42dcdd2eb2
Simplify sendSubsToRoute()
...
Since we were creating subs on the fly, sub.im would always be nil.
We were passing a client because it was needed in sendRouteSubOrUnSubProtos().
This PR simply fills the buffer with each account's subscriptions.
There is also no need to have subs sent from a different go routine
based on some threshold. Routes are no longer subject to max pending.
Some code has been moved into a function so that it can be shared
by sendSubsToRoute() and sendRouteSubOrUnSubProtos(). The function
simply adds the RS+/- protocol to the given buffer.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com >
2021-01-19 14:01:43 -07:00
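A shared helper of the kind 42dcdd2eb2 describes might look roughly like the sketch below. The function name `appendRouteProto` is hypothetical, and the wire layout used here (`RS+ <account> <subject>` terminated by CRLF) is an assumption about the route protocol; queue names and weights are left out entirely.

```go
package main

import "fmt"

// appendRouteProto appends a single RS+ (add) or RS- (remove) line for a
// plain subscription to buf and returns the extended buffer.
// Hypothetical helper: queue subscriptions are not handled in this sketch.
func appendRouteProto(buf []byte, add bool, account, subject string) []byte {
	if add {
		buf = append(buf, "RS+ "...)
	} else {
		buf = append(buf, "RS- "...)
	}
	buf = append(buf, account...)
	buf = append(buf, ' ')
	buf = append(buf, subject...)
	return append(buf, "\r\n"...)
}

func main() {
	// Fill one buffer with every subscription of every account,
	// then send it in a single write (the write is omitted here).
	subs := map[string][]string{"ACC": {"orders.*", "billing.>"}}
	var buf []byte
	for account, subjects := range subs {
		for _, subject := range subjects {
			buf = appendRouteProto(buf, true, account, subject)
		}
	}
	fmt.Printf("%q\n", buf)
}
```

Appending into a caller-supplied buffer lets both the bulk path and the single-update path reuse the same code without extra allocations.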
Derek Collison
78747b2414
Stability improvements around startup and restore.
...
We were incorrectly starting clustering before enabling accounts and restoring state.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-17 13:44:49 -08:00
Derek Collison
a603f439bb
Make all requests same timeout
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-16 14:17:35 -08:00
Derek Collison
a18a6803c1
Added support for stream and consumer lists.
...
This utilizes a scatter and gather approach.
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-16 12:42:45 -08:00
Derek Collison
cb69df7118
Add proper support for stream update
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-16 06:29:37 -08:00
Derek Collison
b606dceb59
Stabilize restart/catchup for raft.
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-16 05:47:48 -08:00
Ivan Kozlovic
1874964498
Merge pull request #1812 from nats-io/leafnode_fixes
...
Fixed some leafnode issues introduced from JS cluster work
2021-01-15 18:22:02 -07:00
Ivan Kozlovic
0d78bce9cf
Fixed some leafnode issues introduced from JS cluster work
...
Also fixed a flapper.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com >
2021-01-15 12:00:34 -07:00
Ryota
91a1d9a556
Update error message with correct config value
2021-01-15 13:18:31 +00:00
Ivan Kozlovic
6c4229300a
Fixed service import cycle detection that broke with JS clustering
...
Also added some no-op error handler for some tests to silence the
error report in the log.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com >
2021-01-14 11:27:36 -07:00
Derek Collison
1b0e740123
Fix for race
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-14 07:28:14 -08:00
Derek Collison
491b3c34cd
Tweak timing more for tests
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-14 07:07:04 -08:00
Derek Collison
ab2a645791
Fix for various flappers
...
Signed-off-by: Derek Collison <derek@nats.io >
2021-01-14 06:54:08 -08:00