nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-14 10:10:42 -07:00

Author	SHA1	Message	Date
Derek Collison	ed3f8be0c5	Bump version 2.10.0-beta.36 Signed-off-by: Derek Collison <derek@nats.io>	2023-05-06 18:49:13 -07:00
Derek Collison	18244ea8cb	Fix test that did not set ack policy to explicit Signed-off-by: Derek Collison <derek@nats.io>	2023-05-06 15:10:46 -07:00
Derek Collison	caa262513d	Fix test that did not set ack policy which is needed Signed-off-by: Derek Collison <derek@nats.io>	2023-05-06 14:15:44 -07:00
Derek Collison	dbff40f2b6	Adopt same update from main Signed-off-by: Derek Collison <derek@nats.io>	2023-05-06 09:56:01 -07:00
Derek Collison	4175e4ee9c	Merge branch 'main' into dev	2023-05-06 09:55:34 -07:00
Derek Collison	76f4358349	[IMPROVED] Optimizations for large single hub account leafnode fleets. (#4135) Added a leafnode lock to allow better traversal without copying of large leafnodes in a single hub account. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-06 09:53:08 -07:00
Derek Collison	80db7a22ab	Optimizations for large single hub account leafnode fleets. Added a leafnode lock to allow better traversal without copying of large leafnodes in a single hub account. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-05 13:14:49 -07:00
Waldemar Quevedo	b886fed2fb	Stop using UTC for time for flushClients In #1943 it was adopted to use `UTC()` in some timestamps, but an unintended side effect from this is that it strips the monotonic time, so it can be prone to clock skews when subtracting time in other areas of the code. `e5646b23de`	2023-05-04 15:50:45 -07:00
Tomasz Pietrek	69fb3db0f5	Optimize consumer messages sequences for multiple subjects (#4129) If a consumer with multiple subjects encountered a sequence of messages in a row from the same subject, it tried to load messages from other subjects in some cases. This checks for that scenario and optimizes it by returning early. I added temporary instrumentation to count how many times fetching new messages is called, and it seems to cut those calls as expected. Because this is internal, it is hard to show that in a test. Signed-off-by: Tomasz Pietrek <tomasz@nats.io>	2023-05-04 20:13:13 +02:00
Tomasz Pietrek	7c1c4ea5fb	Optimize consumer messages sequences for multiple subjects If a consumer with multiple subjects encountered a sequence of messages from the same subject, it tried to load messages from other subjects in some cases. This checks for that scenario and optimizes it by returning early. Signed-off-by: Tomasz Pietrek <tomasz@nats.io>	2023-05-04 16:02:19 +02:00
Derek Collison	9fa724cd7b	Merge branch 'main' into dev	2023-05-03 21:00:35 -07:00
Derek Collison	da8aeac91b	Fix flapper Signed-off-by: Derek Collison <derek@nats.io>	2023-05-03 21:00:17 -07:00
Derek Collison	68f6b59fc7	Merge branch 'main' into dev	2023-05-03 19:51:24 -07:00
Derek Collison	ae73e6a573	Bump to 2.9.17-beta.5 Signed-off-by: Derek Collison <derek@nats.io>	2023-05-03 19:50:21 -07:00
Derek Collison	21239022bd	Protect against usage drift for any unforeseen reason and, if detected, correct it. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-03 17:09:06 -07:00
Ivan Kozlovic	311e3feb5f	Merge branch 'main' into dev	2023-05-03 17:38:40 -06:00
Ivan Kozlovic	8a4ead22bc	Updates based on code review Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-03 16:14:51 -06:00
Ivan Kozlovic	7afe76caf8	Fixed Sublist.RemoveBatch to remove subs present, even if one isn't I have seen cases, maybe due to previous issue with configuration reload that would miss subscriptions in the sublist because of the sublist swap, where we would attempt to remove subscriptions by batch but some were not present. I would have expected that all present subscriptions would still be removed, even if the call overall returned an error. This is now fixed and a test has been added demonstrating that even on error, we remove all subscriptions that were present. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-03 15:21:26 -06:00
Ivan Kozlovic	95e4f2dfe1	Fixed accounts configuration reload Issues could manifest with subscription interest not properly propagated. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-03 14:35:06 -06:00
Ivan Kozlovic	840c264f45	Cleanup use of s.opts and fixed some lock (deadlock/inversion) issues One should not access s.opts directly but instead use s.getOpts(). Also, server lock needs to be released when performing an account lookup (since this may result in server lock being acquired). A function was calling s.LookupAccount under the client lock, which technically creates a lock inversion situation. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-03 14:09:02 -06:00
Derek Collison	b61e411b44	Fix race in reload and gateway sublist check (#4127 ) Fixes the following race: during reload account sublist can be changed: `2699465596/server/reload.go (L1598-L1610)` so this can become a race while checking interest in the gateway code here: `79de3302be/server/gateway.go (L2683)` ``` === RUN TestJetStreamSuperClusterPeerReassign ================== WARNING: DATA RACE Write at 0x00c0010854f0 by goroutine 15595: github.com/nats-io/nats-server/v2/server.(Server).reloadAuthorization.func2() /home/travis/gopath/src/github.com/nats-io/nats-server/server/reload.go:1610 +0x486 sync.(Map).Range() /home/travis/.gimme/versions/go1.19.8.linux.amd64/src/sync/map.go:354 +0x225 github.com/nats-io/nats-server/v2/server.(Server).reloadAuthorization() /home/travis/gopath/src/github.com/nats-io/nats-server/server/reload.go:1594 +0x35d github.com/nats-io/nats-server/v2/server.(Server).applyOptions() /home/travis/gopath/src/github.com/nats-io/nats-server/server/reload.go:1454 +0xf4 github.com/nats-io/nats-server/v2/server.(Server).reloadOptions() /home/travis/gopath/src/github.com/nats-io/nats-server/server/reload.go:908 +0x204 github.com/nats-io/nats-server/v2/server.(Server).ReloadOptions() /home/travis/gopath/src/github.com/nats-io/nats-server/server/reload.go:847 +0x4a4 github.com/nats-io/nats-server/v2/server.(Server).Reload() /home/travis/gopath/src/github.com/nats-io/nats-server/server/reload.go:782 +0x125 github.com/nats-io/nats-server/v2/server.(cluster).removeJetStream() /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_helpers_test.go:1498 +0x310 github.com/nats-io/nats-server/v2/server.TestJetStreamSuperClusterPeerReassign() /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_super_cluster_test.go:395 +0xa38 testing.tRunner() /home/travis/.gimme/versions/go1.19.8.linux.amd64/src/testing/testing.go:1446 +0x216 testing.(T).Run.func1() 
/home/travis/.gimme/versions/go1.19.8.linux.amd64/src/testing/testing.go:1493 +0x47 Previous read at 0x00c0010854f0 by goroutine 15875: github.com/nats-io/nats-server/v2/server.(Server).gatewayHandleSubjectNoInterest() /home/travis/gopath/src/github.com/nats-io/nats-server/server/gateway.go:2683 +0x12d github.com/nats-io/nats-server/v2/server.(client).processInboundGatewayMsg() /home/travis/gopath/src/github.com/nats-io/nats-server/server/gateway.go:2980 +0x595 github.com/nats-io/nats-server/v2/server.(client).processInboundMsg() /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3532 +0xc7 github.com/nats-io/nats-server/v2/server.(client).parse() /home/travis/gopath/src/github.com/nats-io/nats-server/server/parser.go:497 +0x34f9 github.com/nats-io/nats-server/v2/server.(client).readLoop() /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1284 +0x17e8 github.com/nats-io/nats-server/v2/server.(Server).createGateway.func1() /home/travis/gopath/src/github.com/nats-io/nats-server/server/gateway.go:858 +0x37 Goroutine 15595 (running) created at: testing.(T).Run() /home/travis/.gimme/versions/go1.19.8.linux.amd64/src/testing/testing.go:1493 +0x75d testing.runTests.func1() /home/travis/.gimme/versions/go1.19.8.linux.amd64/src/testing/testing.go:1846 +0x99 testing.tRunner() /home/travis/.gimme/versions/go1.19.8.linux.amd64/src/testing/testing.go:1446 +0x216 testing.runTests() /home/travis/.gimme/versions/go1.19.8.linux.amd64/src/testing/testing.go:1844 +0x7ec testing.(M).Run() /home/travis/.gimme/versions/go1.19.8.linux.amd64/src/testing/testing.go:1726 +0xa84 github.com/nats-io/nats-server/v2/server.TestMain() /home/travis/gopath/src/github.com/nats-io/nats-server/server/sublist_test.go:1577 +0x292 main.main() _testmain.go:3615 +0x324 Goroutine 15875 (running) created at: github.com/nats-io/nats-server/v2/server.(Server).startGoRoutine() /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3098 +0x88 
github.com/nats-io/nats-server/v2/server.(Server).createGateway() /home/travis/gopath/src/github.com/nats-io/nats-server/server/gateway.go:858 +0xfc4 github.com/nats-io/nats-server/v2/server.(Server).startGatewayAcceptLoop.func1() /home/travis/gopath/src/github.com/nats-io/nats-server/server/gateway.go:553 +0x48 github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections.func1() /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:2184 +0x58 ================== testing.go:1319: race detected during execution of test --- FAIL: TestJetStreamSuperClusterPeerReassign (2.08s) ```	2023-05-02 18:12:56 -07:00
Waldemar Quevedo	938ffcba20	Fix race in reload and gateway sublist check Signed-off-by: Waldemar Quevedo <wally@nats.io>	2023-05-02 17:51:53 -07:00
Derek Collison	ae73f7be55	Small raft improvements. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-02 16:44:27 -07:00
Derek Collison	e7b01c4154	Merge branch 'main' into dev	2023-05-02 16:30:00 -07:00
Derek Collison	9ef71893db	Bump to 2.9.17-beta.4 Signed-off-by: Derek Collison <derek@nats.io>	2023-05-02 09:43:11 -07:00
Derek Collison	4a58feff27	When removing a msg and we need to load the msg block and incur IO, unlock fs lock to avoid stalling other activity on other blocks, e.g. removing and adding msgs at the same time. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-02 08:56:43 -07:00
Derek Collison	eb1eb3c49e	Merge branch 'main' into dev	2023-05-01 16:29:35 -07:00
Derek Collison	f098c253aa	Make sure we adjust accounting reservations when deleting a stream with any issues. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-01 15:54:37 -07:00
Ivan Kozlovic	0a02f2121c	[ADDED] LeafNode: TLSHandshakeFirst option A new field in `tls{}` blocks forces the server to perform the TLS handshake before sending the INFO protocol. ``` leafnodes { port: 7422 tls { cert_file: ... ... handshake_first: true } remotes [ { url: tls://host:7423 tls { ... handshake_first: true } } ] } ``` Note that if `handshake_first` is set on the "accept" side, the first `tls{}` block in the example above, a server trying to create a LeafNode connection to this server would need to have `handshake_first` set to true inside the `tls{}` block of the corresponding remote. Configuration reload of leafnodes is generally not supported, but TLS certificates can be reloaded, and support for this new field was also added. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-01 16:41:51 -06:00
Derek Collison	f5ac5a4da0	Fix for a bug that could leave a raft node running when stopping a stream. This can happen when we reset a stream internally and the stream had a prior snapshot. Also make sure to always release resources back to the account regardless if the store is no longer present. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-01 13:22:06 -07:00
Derek Collison	1eed0e8c75	Bump to 2.9.17-beta.3 Signed-off-by: Derek Collison <derek@nats.io>	2023-04-30 17:43:59 -07:00
Derek Collison	e158c46884	Merge branch 'main' into dev	2023-04-30 17:37:47 -07:00
Derek Collison	c15cc0054a	When a fleet of leafnodes are isolated (not routed but using same cluster) we could do better at optimizing how we update the other leafnodes. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-30 17:08:16 -07:00
Derek Collison	0321eb6484	Merge branch 'main' into dev	2023-04-29 19:52:57 -07:00
Derek Collison	b27ce6de80	Add in a few more places to check on jetstream shutting down. Add in a helper method. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-29 11:27:18 -07:00
Derek Collison	db972048ce	Detect when we are shutting down or if a consumer is already closed when removing a stream. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-29 11:18:10 -07:00
Derek Collison	4eb4e5496b	Do health check on startup once we have processed existing state. Also do health checks in separate go routine. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-29 09:36:35 -07:00
Derek Collison	fac5658966	If we fail to create a consumer, make sure to clean up any raft nodes in meta layer and to shutdown the consumer if created but we encountered an error. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-29 08:15:33 -07:00
Derek Collison	546dd0c9ab	Make sure we can recover an underlying node being stopped. Do not return healthy if the node is closed, and wait a bit longer for forward progress. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-29 07:42:23 -07:00
Derek Collison	85f6bfb2ac	Check healthz periodically Signed-off-by: Derek Collison <derek@nats.io>	2023-04-28 17:58:45 -07:00
Derek Collison	ac27fd046a	Fix data race Signed-off-by: Derek Collison <derek@nats.io>	2023-04-28 17:57:03 -07:00
Derek Collison	d107ba3549	Under certain scenarios we have witnessed healthz() never returning healthy due to a stream or consumer being missing or stopped. This will now allow the healthz call to attempt to restart those assets. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-28 17:11:08 -07:00
Ivan Kozlovic	349f01e86a	Change the absence of compression setting to default to "accept" In that mode, a server accepts compression and will switch to the same compression level as the remote (if one is set), but will not initiate compression. So if no server in a cluster has a compression setting, each defaults to "accept", which means that compression is "off". Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-28 15:33:17 -06:00
Ivan Kozlovic	5b8c9ee364	Changes based on code review Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-28 14:34:32 -06:00
Ivan Kozlovic	70af04a63f	Other flappers. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-28 11:22:04 -06:00
Ivan Kozlovic	73ed55ae5b	Fixed flapper Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-28 10:55:32 -06:00
Ivan Kozlovic	8d2683a062	Fixed data race Reverts changes made in PR#4001: `105237cba8 (diff-1322a81c43dfdd05284ae128c43d9ea51c1a3b677587686561ef6de47024e14aR1340)` Since a fix was made here: `b78ec39b1f` the changes made in PR need to be reverted. The test TestRoutePoolAndPerAccountWithServiceLatencyNoDataRace now passes. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-28 10:18:14 -06:00
Ivan Kozlovic	d6fe9d4c2d	[ADDED] Support for route S2 compression The new field `compression` in the `cluster{}` block allows specifying which compression mode to use between servers. It can be simply specified as a boolean or a string for the simple modes, or as an object for the "s2_auto" mode where a list of RTT thresholds can be specified. By default, if no compression field is specified, the server will use the s2_auto mode with default RTT thresholds of 10ms, 50ms and 100ms for the "uncompressed", "fast", "better" and "best" modes. ``` cluster { .. # Possible values are "disabled", "off", "enabled", "on", # "accept", "s2_fast", "s2_better", "s2_best" or "s2_auto" compression: s2_fast } ``` To specify a different list of thresholds for the s2_auto mode, the configuration would look like this: ``` cluster { .. compression: { mode: s2_auto # This means that for RTT up to 5ms (included), then # the compression level will be "uncompressed", then # from 5ms+ to 15ms, the mode will switch to "s2_fast", # then from 15ms+ to 50ms, the level will switch to # "s2_better", and anything above 50ms will result # in the "s2_best" compression mode. rtt_thresholds: [5ms, 15ms, 50ms] } } ``` Note that the "accept" mode means that a server will accept compression from a remote and switch to that same compression mode, but will otherwise not initiate compression. That is, if 2 servers are configured with "accept", then compression will actually be "off". If one of the servers had, say, s2_fast, then they would both use this mode. If a server has a compression mode set (other than "off") but connects to an older server, there will be no compression between those 2 routes. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-27 17:59:25 -06:00
Marco Primi	82eade93b4	Merge JS Chaos tests into a single file	2023-04-27 14:56:55 -07:00
Marco Primi	7908d8c05c	Merge JS benchmarks into a single file	2023-04-27 14:56:55 -07:00

5068 Commits