Commit Graph

5672 Commits

Author SHA1 Message Date
Derek Collison
31fc00a6e5 Merge pull request #3039 from nats-io/delete-create
Make sure if we recreate something after deleting that we do not wipe valid state
2022-04-15 12:57:49 -07:00
Derek Collison
10c877d942 Make sure if we recreate something after deleting that we do not wipe valid state
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-15 12:22:10 -07:00
Ivan Kozlovic
a6b62f61a7 Fix test that should have been fixed following FC tweak
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-14 18:06:25 -06:00
Ivan Kozlovic
0e841d4acf Tweak ordered consumer flow control and bump to beta.18
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-14 17:43:43 -06:00
Ivan Kozlovic
09609a4d63 Bump to 2.8.0-beta.17
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-14 11:28:40 -06:00
Ivan Kozlovic
c52d65bbed Merge pull request #3035 from nats-io/js_wait_routing
[IMPROVED] JetStream: reduce unnecessary leader election
2022-04-14 11:27:39 -06:00
Ivan Kozlovic
4e7c72ab33 Update based on code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-14 11:00:33 -06:00
Ivan Kozlovic
bd61d51a1c [IMPROVED] JetStream: reduce unnecessary leader election
- Wait for some sort of routing to be in place before starting
the raft run loop
- Remove use of lock in apiDispatch that was not necessary but
could have caused a route to block, causing memory growth, etc.

Unrelated rename of some tests so that they start with TestJetStream
and TestJetStreamCluster for cluster tests, fixed some flappers
and ensure that tests that change RAFT timeouts put them back
to default values on exit.
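
The point about tests restoring RAFT timeouts on exit can be sketched as follows. This is a minimal illustration, not the server's test code: the variable names, the default values, and both helper functions are hypothetical.

```go
package main

import "time"

// Hypothetical package-level timeouts standing in for the RAFT election
// settings that a test might shorten. Names and values are assumptions.
var (
	minElectionTimeout = 4 * time.Second
	maxElectionTimeout = 9 * time.Second
)

// setElectionTimeouts overrides the timeouts and returns a function that
// puts the previous values back.
func setElectionTimeouts(min, max time.Duration) (restore func()) {
	prevMin, prevMax := minElectionTimeout, maxElectionTimeout
	minElectionTimeout, maxElectionTimeout = min, max
	return func() {
		minElectionTimeout, maxElectionTimeout = prevMin, prevMax
	}
}

// runWithFastElections shows the call pattern inside a test body: the
// override happens immediately, the restore is deferred until exit.
func runWithFastElections(body func()) {
	defer setElectionTimeouts(100*time.Millisecond, 200*time.Millisecond)()
	body()
}
```

Deferring the returned function keeps the restore next to the override, so a test that fails or returns early still leaves the defaults in place for the next test.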

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-14 10:47:14 -06:00
Derek Collison
0df5da3924 Merge pull request #3036 from nats-io/move_improvements
Improvements to stream and consumer move.
2022-04-14 09:40:45 -07:00
Derek Collison
9748925f13 Improvements to stream and consumer move.
During elected stepdown and transfer allow the new leader to take over before we stepdown.
We could receive a leader change, so make sure to also check migration state.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-14 07:27:29 -07:00
Matthias Hanel
ec3f9258af [Adding] max_ha_assets to limit placement on server with more ha assets (#3032)
* [Adding] max_ha_assets to limit placement on server with more ha assets

A server running more than max_ha_assets raft nodes will not be used to
place new streams, and placement fails if not enough free servers can be found.
Durable consumer creation on such a server will fail, as a consumer's peer set
is bound to the same set as its stream.

This also avoids updating placement where no new placement is needed.
This is the case when, on update, placement tags get removed. 
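
A rough sketch of the placement rule described above, under stated assumptions: the `peer` type, `selectPeers`, and the error value are hypothetical, and treating a limit of 0 as "no limit" is an assumption rather than something this message states.

```go
package main

import "errors"

// peer is a hypothetical view of a server considered for placement.
type peer struct {
	name     string
	haAssets int // raft nodes (HA assets) already running on the server
}

var errInsufficientResources = errors.New("insufficient resources")

// selectPeers returns the first `replicas` candidates that are not over
// the max_ha_assets limit, or an error if too few qualify.
// Assumption: maxHAAssets <= 0 disables the check.
func selectPeers(candidates []peer, replicas, maxHAAssets int) ([]string, error) {
	if replicas <= 0 {
		return nil, errors.New("replicas must be positive")
	}
	selected := make([]string, 0, replicas)
	for _, p := range candidates {
		// Skip servers running more than the allowed number of HA assets.
		if maxHAAssets > 0 && p.haAssets > maxHAAssets {
			continue
		}
		selected = append(selected, p.name)
		if len(selected) == replicas {
			return selected, nil
		}
	}
	return nil, errInsufficientResources
}
```

With candidates running 5, 1 and 2 HA assets and a limit of 2, a two-replica request would land on the second and third servers, while a three-replica request would fail.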

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-14 01:53:41 -04:00
Jaime Piña
0dabed2ea3 Re-enable placement tests (#3034)
2022-04-13 13:44:24 -07:00
Derek Collison
1c0112a476 Bump to 2.8.0-beta.16
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-13 12:32:55 -07:00
Derek Collison
07c14e4dc6 Merge pull request #3033 from nats-io/catchup_fc
Bump catchup flow control max outstanding bytes
2022-04-13 12:29:08 -07:00
Derek Collison
cd8aeab4ea Bump catchup flow control max outstanding bytes
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-13 12:05:21 -07:00
Ivan Kozlovic
08d1507c50 Merge pull request #3031 from nats-io/fix_3024
[FIXED] LeafNode interest propagation with imports/exports
2022-04-13 13:00:51 -06:00
Ivan Kozlovic
c1a17e890a Fixed JetStream flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-13 09:55:24 -06:00
Ivan Kozlovic
c92dc0dc5b [FIXED] LeafNode interest propagation with imports/exports
When using subscriptions through import/exports, the server with
a leafnode connection would properly send the interest over, but
if the connection is recreated, this would not happen.

In case of JetStream where that happens under the cover, message
flow would stop after the leafnode restart because the subscriptions
would be created on recovery of the JetStream assets but *before*
the LeafNode connection could be established.

Resolves #3024
Resolves #3027
Resolves #3009

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-13 09:55:24 -06:00
Ivan Kozlovic
e06e0a247f Merge pull request #3030 from nats-io/js_data_race
Fixed data race with RAFT node election timer
2022-04-12 19:04:44 -06:00
Ivan Kozlovic
1ba617bba0 Fixed data race with RAFT node election timer
Got this race:
```
==================
WARNING: DATA RACE
Read at 0x00c001c880e8 by goroutine 342:
  github.com/nats-io/nats-server/v2/server.(*raft).resetElect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1525 +0x44
  github.com/nats-io/nats-server/v2/server.(*raft).resetElectionTimeout()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1520 +0xa4
  github.com/nats-io/nats-server/v2/server.(*raft).handleAppendEntry()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:2537 +0x12e
  github.com/nats-io/nats-server/v2/server.(*raft).handleAppendEntry-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:2525 +0xcc
...

Previous write at 0x00c001c880e8 by goroutine 587:
  github.com/nats-io/nats-server/v2/server.(*raft).resetElect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1526 +0x113
  github.com/nats-io/nats-server/v2/server.(*raft).resetElectionTimeout()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1520 +0xa4
  github.com/nats-io/nats-server/v2/server.(*Server).startRaftNode()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:484 +0x20d1
  github.com/nats-io/nats-server/v2/server.(*jetStream).createRaftGroup()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:1497 +0x9ed
  github.com/nats-io/nats-server/v2/server.(*jetStream).processClusterCreateConsumer()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:3063 +0xba4
...

==================
WARNING: DATA RACE
Read at 0x00c0006671f0 by goroutine 342:
  time.(*Timer).Stop()
      /usr/local/go/src/time/sleep.go:78 +0x84
  github.com/nats-io/nats-server/v2/server.(*raft).resetElect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1528 +0x58
  github.com/nats-io/nats-server/v2/server.(*raft).resetElectionTimeout()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1520 +0xa4
  github.com/nats-io/nats-server/v2/server.(*raft).handleAppendEntry()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:2537 +0x12e
  github.com/nats-io/nats-server/v2/server.(*raft).handleAppendEntry-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:2525 +0xcc
...

Previous write at 0x00c0006671f0 by goroutine 587:
  time.NewTimer()
      /usr/local/go/src/time/sleep.go:92 +0xb3
  github.com/nats-io/nats-server/v2/server.(*raft).resetElect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1526 +0x104
  github.com/nats-io/nats-server/v2/server.(*raft).resetElectionTimeout()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:1520 +0xa4
  github.com/nats-io/nats-server/v2/server.(*Server).startRaftNode()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/raft.go:484 +0x20d1
  github.com/nats-io/nats-server/v2/server.(*jetStream).createRaftGroup()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:1497 +0x9ed
...
```

Looked at all places where resetElect() or resetElectionTimeout() was invoked without
being protected by the raft's lock and added it.
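
The shape of that fix can be sketched as below. This is an illustration only, not the code in raft.go: the `node` type, its fields, and the `hammer` helper are hypothetical, and only the method names mirror the ones in the trace.

```go
package main

import (
	"sync"
	"time"
)

// node is a hypothetical stand-in for a raft node.
type node struct {
	mu    sync.Mutex
	elect *time.Timer
}

// resetElect creates or re-arms the election timer.
// The caller must hold n.mu.
func (n *node) resetElect(d time.Duration) {
	if n.elect == nil {
		n.elect = time.NewTimer(d)
		return
	}
	if !n.elect.Stop() {
		// Drain a stale tick, if any, before re-arming.
		select {
		case <-n.elect.C:
		default:
		}
	}
	n.elect.Reset(d)
}

// resetElectionTimeout is the locked entry point used by callers that do
// not already hold the lock.
func (n *node) resetElectionTimeout(d time.Duration) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.resetElect(d)
}

// hammer resets the timer from several goroutines at once, mimicking a
// startup path racing with an append-entry handler.
func hammer(n *node, workers, iterations int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				n.resetElectionTimeout(time.Second)
			}
		}()
	}
	wg.Wait()
}
```

Because every read and write of the timer field goes through the mutex, running `hammer` under `go test -race` should not report the read/write pair shown in the trace.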

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-12 18:56:28 -06:00
Ivan Kozlovic
37a3403585 Bump to version 2.8.0-beta.15
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-12 17:50:21 -06:00
Ivan Kozlovic
3e7f60b52b Merge pull request #3029 from nats-io/js_catchup
[FIXED] JetStream stream catchup issues and deadlock
2022-04-12 17:49:42 -06:00
Derek Collison
3c0bced76e Move test to no race, rename others
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-12 16:23:36 -07:00
Derek Collison
3bd8ee845e Fix description for Wipe
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-12 16:18:38 -07:00
Derek Collison
04db6b0935 Only wipe on certain errors and always resume
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-12 15:50:37 -07:00
Ivan Kozlovic
50c3986863 [FIXED] JetStream stream catchup issues
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-12 16:05:12 -06:00
Derek Collison
5dfcc5e934 Fix for flapping WAL test
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 22:50:25 -07:00
Derek Collison
ce650937f0 Don't set domain here
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 20:52:22 -07:00
Matthias Hanel
0f113aa3d5 [FIXED] subject renaming with hand crafted reply subject (#3026)
do so by rejecting jsackprefix in reply subjects
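
A minimal sketch of such a check, assuming the ack prefix is `$JS.ACK.`; the constant, the error value, and `validateReply` are hypothetical names used for illustration and are not taken from the server code.

```go
package main

import (
	"errors"
	"strings"
)

// ackPrefix is the assumed reserved prefix for JetStream ack subjects.
const ackPrefix = "$JS.ACK."

var errReservedReply = errors.New("reply subject uses reserved ack prefix")

// validateReply rejects client-supplied reply subjects that start with
// the reserved prefix, so a hand-crafted reply cannot pass for an ack.
func validateReply(reply string) error {
	if strings.HasPrefix(reply, ackPrefix) {
		return errReservedReply
	}
	return nil
}
```

An ordinary inbox subject passes the check, while a reply that begins with the reserved prefix is refused before any subject rewriting takes place.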

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-11 22:32:02 -04:00
Derek Collison
fe5a3f584a Merge pull request #3023 from nats-io/stream_alts
Stream Alternates
2022-04-11 18:13:05 -07:00
Derek Collison
aa256de55b Add in Domain to alternates
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 18:47:19 -06:00
Derek Collison
b7718e2b7a First pass support for stream alternates
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 18:47:19 -06:00
Derek Collison
0979c9f720 Bump to 2.8.0-beta.14
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 17:41:57 -07:00
Derek Collison
04cce6df68 Merge pull request #3020 from nats-io/move-updates
[IMPROVED] Raft layer for general stability and leader election.
2022-04-11 17:33:13 -07:00
Matthias Hanel
02d25cc640 [FIXED] Consumer deliver subject incorrect when imported and crossing gateway (#3025)
followup to #3017

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-11 20:27:25 -04:00
Derek Collison
e330572cef Select next leader before truncating
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 17:04:29 -07:00
Derek Collison
3ed1ecc032 Remove old code
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 12:00:29 -07:00
Jaime Piña
cfa55281ec Refactor SystemLimitsPlacement tests (#3014)
2022-04-11 11:41:38 -07:00
Matthias Hanel
13e5ab10bd fix js nex interest check where leaf node masked gw subj propagation (#3016)
basically a gw subject propagation issue could be hidden behind a leaf
node.
also change error text when this was the case

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-11 14:04:09 -04:00
Derek Collison
95f3a3f919 Resolved conflicts with main
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 06:24:47 -07:00
Derek Collison
c3612b57c7 Fixes for some flapping tests
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 13:02:03 -07:00
Derek Collison
37cbac99e7 Improvements to the raft layer for general stability and support of scale up and down and asset move.
Also fixed a bug that would allow a leadership transfer when catching up.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:59:39 -07:00
Derek Collison
efb91c4ade Upgrade to latest released client
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:03:11 -07:00
Derek Collison
e7ff38a4ca Add consumerMemStore impl to allow proper replication of state.
Resolves #3006

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:01:13 -07:00
Derek Collison
4f83736265 Merge pull request #3017 from nats-io/consumer_subj_rewrite
[FIXED] Consumer deliver subject incorrect when imported and crossing a route.
2022-04-09 12:26:27 -07:00
Derek Collison
2510d671de Skip flapper for now, will fix in separate PR
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-09 11:55:04 -07:00
Derek Collison
cd7f16f28a Tweak timing for test to prevent flapping
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-09 11:13:49 -07:00
Derek Collison
331c2faaa6 When using a stream import for a push consumer's messages, if the message crossed a route we dropped the delivered subject.
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-09 06:42:22 -07:00
Derek Collison
3663d595fc Disallow moving a stream that is already being moved
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-07 17:09:55 -07:00
Matthias Hanel
5662141932 Adding unique_tag to ensure matching tags are not used twice (#3011)
Allows a stream to avoid being placed in the same availability zone twice.
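
One way to picture the rule, as a sketch under assumptions: `unique_tag` is treated here as a tag prefix such as `az:`, servers without a matching tag are skipped, and the `server` type and `pickUnique` function are hypothetical.

```go
package main

import "strings"

// server is a hypothetical placement candidate with its configured tags.
type server struct {
	name string
	tags []string
}

// pickUnique selects up to `replicas` servers so that no two of them
// share the same tag starting with uniquePrefix (for example "az:").
// It returns nil if not enough distinct tag values are available.
// Assumption: servers with no matching tag are not eligible.
func pickUnique(servers []server, replicas int, uniquePrefix string) []string {
	used := make(map[string]bool)
	var picked []string
	for _, s := range servers {
		tag, ok := "", false
		for _, t := range s.tags {
			if strings.HasPrefix(t, uniquePrefix) {
				tag, ok = t, true
				break
			}
		}
		if !ok || used[tag] {
			continue // no matching tag, or that tag value is already taken
		}
		used[tag] = true
		picked = append(picked, s.name)
		if len(picked) == replicas {
			return picked
		}
	}
	return nil
}
```

With two servers tagged `az:a` and one tagged `az:b`, a two-replica stream would use one server from each zone, and a three-replica stream could not be placed.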

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-07 18:11:00 -04:00