7975 Commits

Author SHA1 Message Date
Derek Collison
bcf5da04e3 Merge branch 'main' into dev 2023-08-22 06:50:36 -07:00
Derek Collison
90f5371a4c [FIXED] R1 stream move would sometimes lose all msgs. (#4413)
When moving streams, we could check too soon and be in a gap where the
replica peer has not registered a catchup request but had made contact
via the NRG layer.

This would cause us to think the replica was caught up, incorrectly, and
drop our leadership, which would cancel any catchup requests.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-22 06:49:57 -07:00
Derek Collison
e5d208bf33 When moving streams, we could check too soon and be in a gap where the replica peer has not registered a catchup request.
This would cause us to think the replica was caught up, incorrectly, and drop our leadership, which would cancel any catchup requests.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 20:07:48 -07:00
Derek Collison
e088583cd3 Bump to 2.10.0-beta.50
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:59:53 -07:00
Derek Collison
dd4cdfd2fd Specify latest go 1.19 version
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:59:30 -07:00
Derek Collison
f0e2765b44 Fixes for merge conflicts from main
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:55:31 -07:00
Derek Collison
fb8525b713 Merge branch 'main' into dev
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:55:00 -07:00
Derek Collison
2fc3f45ea1 [FIXED] Durable pull consumers could get cleaned up incorrectly on leader change. (#4412)
Fix for a bug that would allow old leaders of pull based durables to
delete a consumer from an inactivity threshold timer inadvertently.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:35:44 -07:00
Derek Collison
6e3ae20650 [FIXED] Fixed deadlock when checkAndSync was being called as part of storing message (#4411)
We violated the locking pattern, so we now make sure we do this in a
separate goroutine and add checks so that it only runs once.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:28:58 -07:00
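The fix described in 6e3ae20650 (run the check in a separate goroutine, and only once at a time) can be illustrated with a minimal sketch. The `store` type, its fields, and the method bodies below are hypothetical stand-ins, not the actual nats-server code; only the general pattern is taken from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// store is a hypothetical stand-in for a message store guarded by a mutex.
type store struct {
	mu      sync.Mutex
	syncing atomic.Bool  // true while a background check is in flight
	wg      sync.WaitGroup
	runs    atomic.Int32 // counts completed checks, for illustration only
}

// checkAndSync takes the lock itself, so calling it inline from storeMsg
// (which already holds the lock) would deadlock.
func (s *store) checkAndSync() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.runs.Add(1)
}

// storeMsg stores a message under the lock and hands the check off to a
// goroutine, starting a new one only if none is currently running.
func (s *store) storeMsg() {
	s.mu.Lock()
	defer s.mu.Unlock()
	// ... store the message here ...
	if s.syncing.CompareAndSwap(false, true) {
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			defer s.syncing.Store(false) // reset to false when done
			s.checkAndSync()
		}()
	}
}

func main() {
	s := &store{}
	for i := 0; i < 100; i++ {
		s.storeMsg()
	}
	s.wg.Wait()
	fmt.Println("checks run:", s.runs.Load())
}
```

The compare-and-swap guard means concurrent or repeated stores cannot pile up background checks, and the background goroutine simply waits for the lock instead of re-acquiring it on the same call path.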
Derek Collison
0a86bf4a9a Should reset to false, not true when done
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 14:57:17 -07:00
Derek Collison
43314fd439 Fix for a bug that would allow old leaders of pull based durables to delete a consumer from an inactivity threshold.
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 14:53:09 -07:00
Neil
3377f04b00 Send shutdown event on LDM (#4405)
If we send an event when entering lame duck mode, other nodes will mark
the server as offline immediately, therefore R1 assets will not be
placed onto that node. This is not a problem with R3 or higher because
an LDM server operates as a Raft observer only and therefore cannot take
the leadership role from an election, but R1 assets can in theory be
placed onto any node that is not marked as offline.

A final shutdown event will still be sent when the server actually shuts
down so there is no change there.

Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-21 22:29:30 +01:00
Neil Twigg
d720a6931c Use own subject for LDM event
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-21 22:03:26 +01:00
Neil Twigg
7cc5838a6d Send shutdown event on LDM so that R1 assets do not get assigned to the LDM node
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-21 21:29:01 +01:00
Derek Collison
10f73e888e Remove 1.18 compile build, support 1.19 and above
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 12:12:43 -07:00
Derek Collison
e018705a08 Fixed deadlock when checkAndSync was being called as part of storing message.
We violated the locking pattern, so we now make sure we do this in a separate goroutine and add checks so that it only runs once.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 12:12:36 -07:00
Waldemar Quevedo
0712628098 Standardize issue forms (#4408) 2023-08-21 06:16:38 -07:00
Neil
4886f1fec4 [IMPROVED] StreamInfo reflecting subject transforms with just a filter and no transformation for Sources (#4403)
- [X] Branch rebased on top of current main (`git pull --rebase origin
main`)
- [X] Changes squashed to a single commit (described
[here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html))
 - [x] Build is green in Travis CI
- [X] You have certified that the contribution is your original work and
that you license the work to the project under the [Apache 2
license](https://github.com/nats-io/nats-server/blob/main/LICENSE)

Adds sfs to sourceInfo such that transforms with just a subject filter
(and no transformation, meaning that the transform pointer in streamInfo
is nil) can still be reflected in SourceInfo, which is important since
the filtering is still happening, just no transformation as well.
2023-08-21 09:44:54 +01:00
Jean-Noël Moyne
62f62d4071 Adds sfs to sourceInfo
Adds sfs to SourceInfo such that transforms with just a subject filter (and no transformation, meaning that the transform pointer in streamInfo is nil) can still be reflected in SourceInfo, which is important since the filtering is still happening, just no transformation as well.

Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-19 12:26:42 -07:00
Byron Ruth
a43075cabc Address typos
Signed-off-by: Byron Ruth <byron@nats.io>
2023-08-19 06:33:19 -04:00
Byron Ruth
ca79ac9a73 Update issue forms
Signed-off-by: Byron Ruth <byron@nats.io>
2023-08-18 21:39:57 -04:00
Neil
ff688ab8ec Tweak consumer replica scaling (#4404)
This should hopefully catch some consumer scaling situations more
reliably, including cases where the consumer filter subjects no longer
match those of the stream after being scaled down to R1 or after a
cluster restart. I've also added a test that checks whether filtered
consumers will scale properly even when the stream subject orphans them.

Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-18 09:36:29 +01:00
Neil Twigg
c437157c1f Recover in consumer assignment when asset already existed
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-17 23:22:10 +01:00
Neil Twigg
3c85490dc0 Backport test helper tweak
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-17 15:28:48 +01:00
Neil Twigg
c0636d117f Tweak consumer replica scaling, add unit test for orphaned consumer subjects
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-17 15:27:29 +01:00
Neil
3f28de8e83 Don't set block profile rate (#4402)
We rarely benefit from block profiles and in many cases a mutex profile
will tell us what we need to know. Additionally, setting the block
profile rate to `1` has the special meaning of capturing every single
blocking event, which can have a fairly significant negative impact on
publish performance.

Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-16 19:11:49 +01:00
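As background for 3f28de8e83, the Go runtime exposes separate knobs for block and mutex profiling. The sketch below uses only standard `runtime` calls; the helper names and the sampling fraction of 5 are arbitrary choices for illustration, not values taken from nats-server.

```go
package main

import (
	"fmt"
	"runtime"
)

// enableMutexProfiling turns on sampling of mutex contention events and
// returns the previous setting. On average 1/fraction events are reported.
func enableMutexProfiling(fraction int) int {
	return runtime.SetMutexProfileFraction(fraction)
}

// currentMutexFraction reads the current setting without changing it
// (a negative argument is documented as read-only).
func currentMutexFraction() int {
	return runtime.SetMutexProfileFraction(-1)
}

func main() {
	// Deliberately NOT calling runtime.SetBlockProfileRate(1) here:
	// a rate of 1 records every blocking event, which is the costly
	// case the commit message describes.
	prev := enableMutexProfiling(5)
	fmt.Println("previous mutex profile fraction:", prev)
	fmt.Println("current mutex profile fraction:", currentMutexFraction())
}
```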
Neil Twigg
19397a5683 Don't set block profile rate
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-16 17:00:07 +01:00
Neil
7670cf581c Fix stream config update of source transforms (#4400)
- [x] Build is green in Travis CI
- [X] You have certified that the contribution is your original work and
that you license the work to the project under the [Apache 2
license](https://github.com/nats-io/nats-server/blob/main/LICENSE)

Fixes a potential out-of-range access during some stream source transform
configuration updates, plus a tiny cleanup.
Fixes stream sourcing message header parsing for multi-subject transforms
in sources.
2023-08-16 09:39:12 +01:00
Jean-Noël Moyne
0cc43acb84 Fix Nats-Stream-Source header parsing when using multi-filter transforms
Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-15 19:22:09 -07:00
Jean-Noël Moyne
c2d3ef1021 Fix potential out of range for stream source transform update.
Clean up an unneeded if statement: it's OK to call NewSubjectTransform with an empty destination (i.e. no transformation), as it will return nil.

Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-15 16:35:19 -07:00
Neil
c2d1e6d051 Add some jitter to leafnode remotes reconnect (#4398)
This adds a jitter delay based on the reconnect delay for when a remote
reconnects.
2023-08-15 17:44:40 +01:00
Waldemar Quevedo
740e5ddc37 Add some jitter to leafnode remotes reconnect
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-15 07:36:37 -07:00
Waldemar Quevedo
1e87c3d820 config: allow empty configs, but prevent bad configs (#4394)
Allows empty configs again and improves detection of some types
of invalid configs:

- Adds reporting the line with the bad key position that makes the
config invalid.

```
nats-server: config is invalid (foo.conf:1:2)
nats-server: error parsing include file 'included.conf', config is invalid (included.conf:2:2)
```

- Fixes a few tests with trailing braces which were being handled as
keys and ignored before.
2023-08-15 06:11:17 -07:00
Neil
8717b050e9 [IMPROVED] $SYS.REQ.SERVER.PING.PROFILEZ always honored (#4393) 2023-08-14 22:03:46 +01:00
Jean-Noël Moyne
61a0555336 Call SetBlockProfileRate even if the profiling port is not set
Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-14 10:58:20 -07:00
R.I.Pienaar
3cc3037c9b Missing json tag (#4395)
Minor update to server info structures
2023-08-14 20:42:58 +03:00
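To show why a missing JSON tag matters for a change like 3cc3037c9b, here is a small sketch with two hypothetical structs (they are not the actual server info structures). Without a tag, `encoding/json` falls back to the Go field name as the key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withoutTag has no struct tag, so the key is the Go field name.
type withoutTag struct {
	ServerName string
}

// withTag names the key explicitly.
type withTag struct {
	ServerName string `json:"server_name"`
}

// encode marshals v and returns the JSON as a string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(withoutTag{ServerName: "s1"})) // {"ServerName":"s1"}
	fmt.Println(encode(withTag{ServerName: "s1"}))    // {"server_name":"s1"}
}
```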
R.I.Pienaar
1d916ef9c7 Adds a missing json encoding tag
Signed-off-by: R.I.Pienaar <rip@devco.net>
2023-08-14 17:41:02 +03:00
Waldemar Quevedo
3a20f66535 config: parsed empty config only show warnings
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-13 23:59:50 -07:00
Waldemar Quevedo
412dee67f1 config: allow empty configs, but prevent bad configs
- Adds reporting the line with the bad key position
  that makes the config invalid.

- Fixes a few tests with trailing braces which were
  being handled as keys and ignored before.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-13 23:13:32 -07:00
Jean-Noël Moyne
40b8aa434b Remove part of the test that expects an error since now you can always get the profilez through the system account request
Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-13 18:00:08 -07:00
Jean-Noël Moyne
b839c53abc [ADDED] Full StreamSource (filters, transforms) functionality to stream mirror (#4354)
- [X] Tests added
- [X] Branch rebased on top of current main (`git pull --rebase origin
main`)
- [X] Changes squashed to a single commit (described
[here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html))
 - [x] Build is green in Travis CI
- [X] You have certified that the contribution is your original work and
that you license the work to the project under the [Apache 2
license](https://github.com/nats-io/nats-server/blob/main/LICENSE)

Follow up to #4276 extending to Mirror the full StreamSource
functionality.

---------

Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-12 15:17:48 -07:00
Jean-Noël Moyne
bb53b54810 Remove the requirement that a profiling port be defined in the server config, so that the profilez request returns profiling data even if the server doesn't have a profiling port set.
Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-12 14:48:38 -07:00
Neil
39eabb4f0a When checking replica count when updating retention, make sure stream assignment is set (#4391)
This should fix a panic found by @scottf:
```
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x68 pc=0x10ce982]
goroutine 51 [running]:
github.com/nats-io/nats-server/v2/server.(*stream).updateWithAdvisory(0xc0000c6380, 0xc000510156?, 0x1)
        C:/nats/temp/nats-server/server/stream.go:1793 +0xa82
github.com/nats-io/nats-server/v2/server.(*stream).update(...)
        C:/nats/temp/nats-server/server/stream.go:1609
github.com/nats-io/nats-server/v2/server.(*Server).jsStreamUpdateRequest(0xc000184d80, 0x4d00000000000000?, 0xc0001d4c80, 0x64d6330f?, {0xc000510140, 0x1c}, {0xc00041c150, 0x11}, {0xc0003cc240, 0x100, ...})
        C:/nats/temp/nats-server/server/jetstream_api.go:1460 +0xbf2
github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch(0xc0001e6000, 0xc0001e2a80, 0xc0001d4c80, 0xc0001c2280, {0xc000510140, 0x1c}, {0xc00041c150, 0x11}, {0xc0003cc240, 0x100, ...})
        C:/nats/temp/nats-server/server/jetstream_api.go:768 +0x26a
github.com/nats-io/nats-server/v2/server.(*client).deliverMsg(0xc0001d4c80, 0x0, 0xc0001e2a80, 0x30?, {0xc000510120, 0x1c, 0x20}, {0xc00041c138, 0x11, 0x18}, ...)
        C:/nats/temp/nats-server/server/client.go:3421 +0xabe
github.com/nats-io/nats-server/v2/server.(*client).processMsgResults(0xc0001d4c80, 0xc0001c2280, 0xc0001ef3e0, {0xc0003cc240, 0x102, 0x120}, {0x0, 0x0, 0x243fb484c00?}, {0xc000510120, ...}, ...)
        C:/nats/temp/nats-server/server/client.go:4473 +0xb12
github.com/nats-io/nats-server/v2/server.(*client).processServiceImport(0xc0001d4c80, 0xc00015c480, 0xc0001c2000, {0xc00008255b, 0x83, 0xa5})
        C:/nats/temp/nats-server/server/client.go:4258 +0x11be
github.com/nats-io/nats-server/v2/server.(*Account).addServiceImportSub.func1(0xc0004a3018?, 0xb02705?, 0x0?, {0x1?, 0x0?}, {0x0?, 0x0?}, {0xc00008255b, 0x83, 0xa5})
        C:/nats/temp/nats-server/server/accounts.go:1993 +0x32
github.com/nats-io/nats-server/v2/server.(*client).deliverMsg(0xc0001d4c80, 0x0, 0xc0001f0000, 0x3100000020?, {0xc000082504, 0x1c, 0xfc}, {0xc000082521, 0x34, 0xdf}, ...)
        C:/nats/temp/nats-server/server/client.go:3419 +0xb69
github.com/nats-io/nats-server/v2/server.(*client).processMsgResults(0xc0001d4c80, 0xc0001c2000, 0xc0001ef050, {0xc00008255b, 0x83, 0xa5}, {0x0, 0x0, 0xc00006ab40?}, {0xc000082504, ...}, ...)
        C:/nats/temp/nats-server/server/client.go:4473 +0xb12
github.com/nats-io/nats-server/v2/server.(*client).processInboundClientMsg(0xc0001d4c80, {0xc00008255b, 0x83, 0xa5})
        C:/nats/temp/nats-server/server/client.go:3893 +0xc8c
github.com/nats-io/nats-server/v2/server.(*client).processInboundMsg(0xc0001d4c80?, {0xc00008255b?, 0x83?, 0xa5?})
        C:/nats/temp/nats-server/server/client.go:3732 +0x3d
github.com/nats-io/nats-server/v2/server.(*client).parse(0xc0001d4c80, {0xc000082500, 0xde, 0x100})
        C:/nats/temp/nats-server/server/parser.go:497 +0x210a
github.com/nats-io/nats-server/v2/server.(*client).readLoop(0xc0001d4c80, {0x0, 0x0, 0x0})
        C:/nats/temp/nats-server/server/client.go:1373 +0x1305
github.com/nats-io/nats-server/v2/server.(*Server).createClientEx.func1()
        C:/nats/temp/nats-server/server/server.go:3130 +0x29
github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine.func1()
        C:/nats/temp/nats-server/server/server.go:3607 +0x1bd
created by github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine
        C:/nats/temp/nats-server/server/server.go:3603 +0x265
```

Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-11 15:09:16 +01:00
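The panic in the trace above is a nil pointer dereference, and the fix in 39eabb4f0a is described as making sure the stream assignment is set before the replica count is checked. A minimal sketch of that kind of guard follows; the types, field names, and the boolean return are hypothetical and do not reflect the actual code in `stream.go`.

```go
package main

import "fmt"

// streamAssignment and stream are hypothetical stand-ins for illustration.
type streamAssignment struct {
	replicas int
}

type stream struct {
	sa *streamAssignment // may be nil, e.g. when no assignment exists
}

// replicaCount reports the replica count and whether an assignment was set,
// instead of dereferencing a possibly nil pointer.
func (s *stream) replicaCount() (int, bool) {
	if s == nil || s.sa == nil {
		return 0, false
	}
	return s.sa.replicas, true
}

func main() {
	unassigned := &stream{}
	assigned := &stream{sa: &streamAssignment{replicas: 3}}

	fmt.Println(unassigned.replicaCount()) // 0 false
	fmt.Println(assigned.replicaCount())   // 3 true
}
```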
Neil Twigg
3c9c124b94 When checking replica count when updating retention, make sure stream assignment is set first
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-11 14:15:49 +01:00
Neil
d474e3b725 [ADDED] $SYS server request to 'kick' or 'LDM' a client connection (#4298)
- [X] Link to issue, e.g. `Resolves #NNN`
- [X] Branch rebased on top of current main (`git pull --rebase origin
main`)
- [ ] Changes squashed to a single commit (described
[here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html))
 - [x] Build is green in Travis CI
- [X] You have certified that the contribution is your original work and
that you license the work to the project under the [Apache 2
license](https://github.com/nats-io/nats-server/blob/main/LICENSE)

Resolves #1556

### Changes proposed in this pull request:

Adds two new $SYS server API endpoints:

- `$SYS.REQ.SERVER.%s.KICK` (where %s is the server_id), which 'kicks' a
client connection (effectively a 'rebalance', as the client application
reconnects right away, potentially to another server in the cluster). The
service takes a JSON payload containing either an "id" or a "name" field.
"id" disconnects the client connection with that id, "name" disconnects
_all_ of the clients connected to the server with that name.

- `$SYS.REQ.SERVER.%s.LDM` (where %s is the server_id), which takes a JSON
payload containing either an "id" or a "name" field. "id" sends an LDM
Info message to the client connection with that id, "name" sends an LDM
Info message to _all_ of the clients connected to the server with that name.

These features allow administrators to manually 're-balance' client
connections between the servers in the cluster (e.g. after a rolling
upgrade of the servers where one server ends up with no client
connections after the upgrade), by kicking some of the client
connections from one of the 'overloaded' (in comparison to other
servers) servers in the cluster, causing them to re-establish their
connection to (hopefully) another server.
2023-08-11 09:39:42 +01:00
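The KICK and LDM requests described in d474e3b725 can be sketched as subject and payload builders. The subject patterns and the "id" / "name" field names come from the commit message; the Go types (a numeric id, a string name), the helper names, and the example server id are assumptions for illustration. Actually sending the request needs a NATS client connected with system account credentials, which is left out here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// clientSelector models the payload: either "id" or "name".
// The field types are assumptions; the commit message only names the keys.
type clientSelector struct {
	ID   uint64 `json:"id,omitempty"`
	Name string `json:"name,omitempty"`
}

// kickSubject builds the KICK request subject for a given server id.
func kickSubject(serverID string) string {
	return fmt.Sprintf("$SYS.REQ.SERVER.%s.KICK", serverID)
}

// ldmSubject builds the LDM request subject for a given server id.
func ldmSubject(serverID string) string {
	return fmt.Sprintf("$SYS.REQ.SERVER.%s.LDM", serverID)
}

// selectorPayload encodes the selector, requiring exactly one field.
func selectorPayload(sel clientSelector) ([]byte, error) {
	if (sel.ID == 0) == (sel.Name == "") {
		return nil, errors.New("set exactly one of id or name")
	}
	return json.Marshal(sel)
}

func main() {
	serverID := "NEXAMPLE" // hypothetical server id

	byID, _ := selectorPayload(clientSelector{ID: 42})
	fmt.Println(kickSubject(serverID), string(byID))

	byName, _ := selectorPayload(clientSelector{Name: "orders-worker"})
	fmt.Println(ldmSubject(serverID), string(byName))
}
```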
Jean-Noël Moyne
fc41ab1a5a Adds LDM and KICK server $SYS requests
Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-10 17:08:09 -07:00
Waldemar Quevedo
37d3220dfb test: fixes for TestLeafNodeSlowConsumer (#4388)
It would fail sometimes locally otherwise...
```
=== RUN   TestLeafNodeSlowConsumer
    leafnode_test.go:7069: got: 0, expected: 1
--- FAIL: TestLeafNodeSlowConsumer (0.29s)
=== RUN   TestLeafNodeSlowConsumer
    leafnode_test.go:7069: got: 0, expected: 1
--- FAIL: TestLeafNodeSlowConsumer (0.28s)
=== RUN   TestLeafNodeSlowConsumer
--- PASS: TestLeafNodeSlowConsumer (0.28s)
=== RUN   TestLeafNodeSlowConsumer
    leafnode_test.go:7069: got: 0, expected: 1
--- FAIL: TestLeafNodeSlowConsumer (0.28s)
=== RUN   TestLeafNodeSlowConsumer
--- PASS: TestLeafNodeSlowConsumer (0.28s)
```
2023-08-10 01:12:21 -07:00
Waldemar Quevedo
9e1e92e325 test: update TestWSTLSVerifyClientCert for go1.21 (#4387)
Similar to #4380 , this TLS error message changed in Go 1.21.
2023-08-10 01:05:38 -07:00
Waldemar Quevedo
f16582e2a4 test: update TestWSTLSVerifyClientCert for go1.21
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-09 21:50:46 -07:00
Waldemar Quevedo
7c9ea91296 test: fix TestLeafNodeSlowConsumer flake
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-09 21:35:24 -07:00