nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-02 03:38:42 -07:00

Author	SHA1	Message	Date
Derek Collison	2737c56352	Only setup auto no-auth for $G account iff no authorization block was defined. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-28 13:51:45 -07:00
Phil Pennock	259e904401	Merge systemd: use SIGUSR2 for shutdown, for LDM (#4603 )	2023-09-28 14:26:09 -04:00
Derek Collison	783edaa36d	[FIXED] Race condition in some leader failover scenarios leading to messages being potentially sourced more than once. (#4592 ) - [X] Tests added - [X] Branch rebased on top of current main (`git pull --rebase origin main`) - [X] Changes squashed to a single commit (described [here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html)) - [x] Build is green in Travis CI - [X] You have certified that the contribution is your original work and that you license the work to the project under the [Apache 2 license](https://github.com/nats-io/nats-server/blob/main/LICENSE) ### Changes proposed in this pull request: Fixes a race condition in some leader failover scenarios leading to messages being potentially sourced more than once. In some failure scenarios, the current leader of a stream sourcing from other stream(s) gets shut down while publications are happening on the stream(s) being sourced. This leads to `setLeader(true)` being called on the new leader for the sourcing stream before all the messages sourced by the previous leader are completely processed. When the new leader then does its reverse scan from the last message in its view of the stream, in order to know what sequence number to start the consumer for the stream being sourced from, the last message(s) sourced by the previous leader get sourced again, leading to some messages being sourced more than once. The existing `TestNoRaceJetStreamSuperClusterSources` test would sidestep the issue by relying on the deduplication window in the sourcing stream. Without deduplication the test is a flapper. This avoids the race condition by adding a small delay before scanning for the last message(s) having been sourced and starting the sources' consumer(s). Now the test (without using the deduplication window) no longer fails from receiving more messages than expected in the sourcing stream. (Also adds a guard to give up if `setupSourceConsumers()` is called and we are no longer the leader for the stream (that check was already present in `setupMirrorConsumer()` so assuming it was forgotten for `setupSourceConsumers()`))	2023-09-28 11:22:20 -07:00
Jean-Noël Moyne	71f96881ab	[FIXED] Race condition in some leader failover scenarios leading to messages being potentially sourced more than once. - In some failure scenarios, the current leader of a stream sourcing from other stream(s) gets shut down while publications are happening on the stream(s) being sourced. This leads to `setLeader(true)` being called on the new leader for the sourcing stream before all the messages sourced by the previous leader are completely processed. When the new leader then does its reverse scan from the last message in its view of the stream, in order to know what sequence number to start the consumer for the stream being sourced from, the last message(s) sourced by the previous leader get sourced again, leading to some messages being sourced more than once. The existing `TestNoRaceJetStreamSuperClusterSources` test would sidestep the issue by relying on the deduplication window in the sourcing stream. Without deduplication the test is a flapper. This avoids the race condition by adding a small delay before scanning for the last message(s) having been sourced and starting the sources' consumer(s). Now the test (without using the deduplication window) no longer fails from receiving more messages than expected in the sourcing stream. - Fix test TestJetStreamWorkQueueSourceRestart that expects the sourcing stream to get all of the expected messages right away by adding a small sleep before checking the number of messages pending on the consumer for that stream. Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>	2023-09-28 10:50:54 -07:00
Derek Collison	f5803ef20e	[FIXED] Routes: Pinned Accounts connect/reconnect in some cases (#4602 ) The issue is with a server that has a route for a given account but connects to a server that does not support it. The creation of the route for this account will fail - as expected - and the server will stop trying to create the route for this account. But it needs to retry to create this route if it were to reconnect to that same URL in case the server (or its config) is updated to support a route for this account. There was also an issue even with 2.10.0 servers in some gossip situations. Namely, if server B is soliciting connections to A (but not vice-versa) and A would solicit connections to C (but not vice-versa). In this case, connections for pinned-accounts would not be created. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-09-28 10:47:58 -07:00
Phil Pennock	47db17a4c8	systemd: use SIGUSR2 for shutdown, for LDM	2023-09-28 13:16:48 -04:00
Ivan Kozlovic	1eb08505d4	[FIXED] Routes: Pinned Accounts connect/reconnect in some cases The issue is with a server that has a route for a given account but connects to a server that does not support it. The creation of the route for this account will fail - as expected - and the server will stop trying to create the route for this account. But it needs to retry to create this route if it were to reconnect to that same URL in case the server (or its config) is updated to support a route for this account. There was also an issue even with 2.10.0 servers in some gossip situations. Namely, if server B is soliciting connections to A (but not vice-versa) and A would solicit connections to C (but not vice-versa). In this case, connections for pinned-accounts would not be created. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-09-28 10:46:32 -06:00
Derek Collison	9c96576066	Bump to 2.10.2-RC.9 Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 20:49:55 -07:00
Derek Collison	89c2c844a2	[IMPROVED] Additional markers for dirty state (#4601 ) Under certain circumstances we could delay recovery if the state file pointed to an absent msg block. Found additional places to mark dirty and optionally kick the flusher. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 20:48:32 -07:00
Derek Collison	b0743ec059	Additional markers for dirty state Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 20:32:17 -07:00
Derek Collison	4c368876d8	[IMPROVED] Concurrent stream creation of the same stream could return not found (#4600 ) Here we know that if we can't find the stream but have the stream assignment, this is a distinct possibility. So we wait, since not processed inline, to see if it appears. Fixes TestJetStreamClusterParallelStreamCreation as well that could flap. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 19:55:52 -07:00
Derek Collison	a7ca71017b	When under load, concurrent stream creation of the same stream could return stream not found, which is odd. Here we know that if we can't find the stream but have the stream assignment, this is a distinct possibility. So we wait, since not processed inline, to see if it appears. Fixes TestJetStreamClusterParallelStreamCreation as well that could flap. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 18:05:43 -07:00
Derek Collison	bc012d78c9	[IMPROVED] Add in warnings for filestore recover state if happy path fails. (#4599 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 16:53:29 -07:00
Derek Collison	aeef0eff53	Add in warnings for filestore recover state if happy path fails. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 16:22:15 -07:00
Derek Collison	c6b26ab5d0	Miscellaneous JetStream benchmark improvements (#4595 ) - [ ] Link to issue, e.g. `Resolves #NNN` - [ ] Documentation added (if applicable) - [ ] Tests added - [x] Branch rebased on top of current main (`git pull --rebase origin main`) - [ ] Changes squashed to a single commit (described [here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html)) - [ ] Build is green in Travis CI - [x] You have certified that the contribution is your original work and that you license the work to the project under the [Apache 2 license](https://github.com/nats-io/nats-server/blob/main/LICENSE) Resolves # ### Changes proposed in this pull request: Miscellaneous fixes and improvements to server JetStream benchmarks. Reviewers: notice the PR is broken down into 5 commits, each one is trivial to review individually, but they can definitely be squashed before merging for easier cherry-picking.	2023-09-27 14:16:12 -07:00
Derek Collison	46c417f4c9	Bump to 2.10.0-RC.8 Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 12:08:45 -07:00
Derek Collison	fc5bccd2ca	Updated Go client (#4597 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 12:08:18 -07:00
Derek Collison	cbc490ab56	Don't take sublist write lock in `match` if sublist cache disabled (#4594 ) We may be creating unnecessary lock contention on the sublist when the cache is disabled by taking the write lock anyway. Signed-off-by: Neil Twigg <neil@nats.io>	2023-09-27 10:19:14 -07:00
Derek Collison	b3f5bac31a	Update for Go client Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 09:55:38 -07:00
Marco Primi	d31236cea2	Refactor cluster creation for JS benchmarks	2023-09-27 09:26:11 -07:00
Marco Primi	be106d1ee5	Remove artificial limit on minimum number of operations	2023-09-27 09:26:11 -07:00
Marco Primi	c5698a9435	Cleanup unnecessary calls to setBytes in JS benchmarks	2023-09-27 09:26:11 -07:00
Marco Primi	e108096601	Improve JS asynchronous publish benchmark Simplify logic and make sure no more than `asyncWindow` messages are ever in-flight	2023-09-27 09:26:11 -07:00
Marco Primi	03aa44dc3d	Improve setup of JS Consume benchmark Handle error condition during stream setup that was resulting in failed runs.	2023-09-27 09:26:11 -07:00
Neil Twigg	02d48ddd00	Don't take sublist write lock in `match` if sublist cache disabled Signed-off-by: Neil Twigg <neil@nats.io>	2023-09-27 16:33:58 +01:00
Derek Collison	4c17eeb79e	[IMPROVED] ServiceImport Reply Optimizations (#4591 ) We added a small performance tweak to the func checkForReverseEntries. In addition, we moved the shutdown bool for the server to an atomic so we could efficiently check it when doing unsubs. If the server is going away there is really no need since the other side will do its own thing when the connection goes away. And finally we do not have to range over the account rrMap if the subscription going away is a reserved reply. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-27 08:07:56 -07:00
Neil	1700f56856	Fix `TestNoRaceJetStreamStreamInfoSubjectDetailsLimits` for changes in nats.go (#4593 ) There are changes in recent versions of nats.go that seemingly increase the size of the stream info and cause this test to fail consistently with `norace_test.go:4259: require no error, but got: nats: maximum payload exceeded`. Fix the test to use larger limits and payloads so we are not sensitive to this when nats.go is upgraded. Signed-off-by: Neil Twigg <neil@nats.io>	2023-09-27 12:59:12 +01:00
Neil Twigg	52b88fd94e	Fix `TestNoRaceJetStreamStreamInfoSubjectDetailsLimits` for changes in nats.go Signed-off-by: Neil Twigg <neil@nats.io>	2023-09-27 11:19:13 +01:00
Derek Collison	75236a5bcd	When unsubscribing do not check rrMap for reserved replies. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-26 21:43:36 -07:00
Derek Collison	4e0656f377	Small performance tweak to checkForReverseEntries. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-26 21:43:20 -07:00
Derek Collison	c5b98f5c79	Make server shutdown an atomic and check inside unsubscribe to avoid unnecessary work. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-26 17:53:58 -07:00
Derek Collison	aaf238121c	Remove commented out code Signed-off-by: Derek Collison <derek@nats.io>	2023-09-26 17:13:31 -07:00
Derek Collison	27049a9e93	Fix nats-general links (#4590 ) - [ ] Link to issue, e.g. `Resolves #NNN` - [ ] Documentation added (if applicable) - [ ] Tests added - [ ] Branch rebased on top of current main (`git pull --rebase origin main`) - [ ] Changes squashed to a single commit (described [here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html)) - [ ] Build is green in Travis CI - [ ] You have certified that the contribution is your original work and that you license the work to the project under the [Apache 2 license](https://github.com/nats-io/nats-server/blob/main/LICENSE) Resolves # ### Changes proposed in this pull request: - Fixes links to the `nats-general` repository.	2023-09-26 06:29:50 -07:00
Joe Henke	1e5b068585	Fix nats-general links	2023-09-26 07:37:57 -04:00
Derek Collison	c583f7fdc7	Bump to 2.10.2-RC.7 Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 21:05:56 -07:00
Derek Collison	ee4d6ee40e	[FIXED] Account resolver lock inversion (#4588 ) There was a lock inversion but low risk since it happened during server initialization. Still fixed it and added the ordering in locksordering.txt file. Also fixed multiple lock inversions that were caused by tests. Signed-off-by: Ivan Kozlovic <ijkozlovic@gmail.com>	2023-09-25 21:05:11 -07:00
Derek Collison	3056af06d2	[FIXED] JetStream: stream assignment data race (#4589 ) Two go routines could possibly execute the stream assignment at the same time. A WaitGroup was used to prevent that, but an issue caused the data race and possible concurrent execution. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-09-25 20:59:24 -07:00
Ivan Kozlovic	ca2a961fa7	[FIXED] JetStream: stream assignment data race Two go routines could possibly execute the stream assignment at the same time. A WaitGroup was used to prevent that, but an issue caused the data race and possible concurrent execution. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-09-25 16:11:09 -06:00
Ivan Kozlovic	a84ce61a93	[FIXED] Account resolver lock inversion There was a lock inversion but low risk since it happened during server initialization. Still fixed it and added the ordering in locksordering.txt file. Also fixed multiple lock inversions that were caused by tests. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-09-25 15:09:11 -06:00
Derek Collison	83cc80ab74	Bump to 2.10.2-RC.6 Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 13:33:03 -07:00
Derek Collison	73f8f87b86	[IMPROVED] The func subjectIsSubsetMatch() is heavy so do without the account lock. (#4586 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 13:32:38 -07:00
Neil	62faa1882d	Add `prof_block_rate` option for enabling/configuring the block profile (#4587 ) The new `prof_block_rate` configuration option allows the block profiler to be enabled on demand after it was previously disabled in #4402. The option is also reloadable so that it can be changed after startup. Signed-off-by: Neil Twigg <neil@nats.io>	2023-09-25 21:28:36 +01:00
Derek Collison	fb4e97e2ec	If we know bigger go ahead and allocate. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 13:12:13 -07:00
Neil Twigg	11feadfe7b	Add `prof_block_rate` option for enabling/configuring the block profile Signed-off-by: Neil Twigg <neil@nats.io>	2023-09-25 21:04:25 +01:00
Derek Collison	382da48180	The func subjectIsSubsetMatch() is heavy so do without the account lock. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 13:01:46 -07:00
Derek Collison	54d4640e8b	Bump to 2.10.2-RC.5 Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 12:32:51 -07:00
Derek Collison	2e12b875d3	[IMPROVED] Move some contended locks to atomic.Bools (#4585 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 12:31:33 -07:00
Derek Collison	a0029181ae	Fix datarace Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 12:04:42 -07:00
Derek Collison	b70f874640	Moved to atomics to detect if we have mapped subjects for an account since we check for each inbound message. If an account has many connections on a server under heavy load this could be contended. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 11:43:34 -07:00
Derek Collison	7ce47fd182	Move server running state to atomic to avoid contention at NRG layer. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 11:18:15 -07:00


8338 Commits