nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-17 03:24:40 -07:00

Author	SHA1	Message	Date
Derek Collison	d3bde2c6e1	Send peer state when adding peers (#4224 ) Currently `UpdateKnownPeers` doesn't send a peer state when a single peer add operation is taking place, but it seems like this can potentially race when there are lots of changes to the replica count happening in rapid succession. Sending the peer state in all cases seems to fix this issue and, so far in my testing, fixes the failground stream update replicas test. Signed-off-by: Neil Twigg <neil@nats.io>	2023-06-08 09:31:49 -07:00
Neil Twigg	6d9955d212	Send peer state when adding peers Signed-off-by: Neil Twigg <neil@nats.io>	2023-06-08 15:25:18 +01:00
Derek Collison	c85db15a2c	Update Go to 1.19.10 (#4223 )	2023-06-08 04:39:20 -07:00
Byron Ruth	3ca99adcca	Update Go to 1.19.10 Signed-off-by: Byron Ruth <byron@nats.io>	2023-06-08 07:10:27 -04:00
Derek Collison	779978d817	Extended replay leafnode test to confirm mirror functionality Signed-off-by: Derek Collison <derek@nats.io>	2023-06-07 14:01:43 -07:00
Derek Collison	822ad00d50	Bump to 2.9.18-beta.1 Signed-off-by: Derek Collison <derek@nats.io>	2023-06-05 14:14:35 -07:00
Derek Collison	2e2ac33920	[IMPROVED] When R1 consumers were recreated with the same name when they became inactive. (#4216) When consumers were R1 and the same name was reused, server restarts could try to clean up old ones and affect the new ones. These changes allow consumer name reuse more effectively during server restarts. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-05 14:04:53 -07:00
Derek Collison	df5df3ce99	Panic fixes (#4214 ) - [ ] Link to issue, e.g. `Resolves #NNN` - [ ] Documentation added (if applicable) - [ ] Tests added - [ ] Branch rebased on top of current main (`git pull --rebase origin main`) - [ ] Changes squashed to a single commit (described [here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html)) - [x] Build is green in Travis CI - [x] You have certified that the contribution is your original work and that you license the work to the project under the [Apache 2 license](https://github.com/nats-io/nats-server/blob/main/LICENSE) Resolves panics in the code. ### Changes proposed in this pull request: - This PR fixes some of the panics in the code	2023-06-05 13:02:05 -07:00
Derek Collison	4ac45ff6f3	When consumers were R1 and the same name was reused, server restarts could try to clean up old ones and affect the new ones. These changes allow consumer name reuse more effectively during server restarts. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-05 12:48:18 -07:00
Nikita Mochalov	5141b87dff	Refactor code	2023-06-05 22:42:28 +03:00
Nikita Mochalov	4c181bc99a	Use sentinel error	2023-06-05 22:41:09 +03:00
Nikita Mochalov	f71c49511b	Fix client panic on absent server field	2023-06-05 15:27:45 +03:00
Derek Collison	64e3bf82ed	Fix PurgeEx replay with sequence & keep succeeds (#4213 ) PR https://github.com/nats-io/nats-server/pull/4212 fixed the issue I reported in https://github.com/nats-io/nats-server/issues/4196. However, I believe there might be a bug when both `sequence` and `keep` are set during recovery. In the `PurgeEx` the following check is done (for both `filestore.go` and `memstore.go`): ```go if sequence > 1 && keep > 0 { return 0, ErrPurgeArgMismatch } ``` The `TestJetStreamClusterPurgeExReplayAfterRestart` also triggers this case, meaning that during the test this error is returned but it succeeds because the purge was already performed. Is this intended behaviour? To elaborate a bit more, I believe the following happens: - when running the purge normally it will properly run the `keep` (since it's not combined with `sequence` yet) - when replaying the purge though, the `sequence` is added to the `keep`, which errors out in the above if Which means that during normal operation all will be well, but purges with `keep` will be ignored upon replaying. I'm proposing to remove the `sequence > 1 && keep > 0` check and subsequent error. Which, for reference, was introduced in https://github.com/nats-io/nats-server/pull/3121. Hoping this ensures that during recovery, purges that haven't executed yet will still be executed. An alternative approach, which wouldn't remove the error: not allow combining `sequence` and `keep` normally and only allowing it during recovery. Which would preserve the current behaviour, and correctly apply `sequence+keep` during recovery still. However, not sure if it's possible to know if we're in "recovery mode" from within `PurgeEx`. Resolves https://github.com/nats-io/nats-server/issues/4196	2023-06-04 13:29:53 -07:00
Maurice van Veen	132567de39	Fix PurgeEx replay with sequence & keep succeeds	2023-06-04 11:56:28 +02:00
Derek Collison	e1f8064e9e	[FIXED] Make sure to process extended purge operations correctly when being replayed. (#4212 ) This is an extension to the excellent work by @MauriceVanVeen and his original PR #4197 to fully resolve for all use cases. Signed-off-by: Derek Collison <derek@nats.io> Resolves #4196	2023-06-03 18:12:22 -07:00
Derek Collison	dee532495d	Make sure to process extended purge operations correctly when being replayed on a restart. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-03 17:49:45 -07:00
Derek Collison	eb09ddd73a	[FIXED] Killed server on restart could render encrypted stream unrecoverable (#4210 ) When a server was killed on restart before an encrypted stream was recovered the keyfile was removed and could cause the stream to not be recoverable. We only needed to delete the key file when converting ciphers and right before we add the stream itself. Signed-off-by: Derek Collison <derek@nats.io> Resolves #4195	2023-06-03 17:36:10 -07:00
Derek Collison	449b429b58	[FIXED] Data races detected in internal testing (#4211 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-06-03 16:20:56 -07:00
Derek Collison	238282d974	Fix some data races detected in internal testing Signed-off-by: Derek Collison <derek@nats.io>	2023-06-03 13:58:15 -07:00
Derek Collison	4c1b93d023	Make sure to put the keyfile back if we did not recover the stream. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-03 11:21:58 -07:00
Derek Collison	d5ae96f54d	When a server was killed on restart before an encrypted stream was recovered the keyfile was removed and could cause the stream to not be recoverable. We only needed to delete the key file when converting ciphers and right before we add the stream itself. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-03 11:21:47 -07:00
Derek Collison	22c97d67ff	[FIXED] Daisy chained leafnodes sometimes would not propagate interest (#4207 ) When we were optimizing for single cluster and large numbers of leafnodes we inadvertently broke a daisy chained scenario where a server was a spoke and a hub within a single hub server. So interest on D would not propagate properly to server A as a publisher. ``` B / \ A C -- D (SUB) \| PUB ```	2023-06-02 16:43:21 -07:00
Derek Collison	b2ac621212	Bump to 2.9.18-beta (#4182 )	2023-06-02 16:40:23 -07:00
Derek Collison	1bce79750e	When we were optimizing for single cluster but large number of leafnodes we inadvertently broke a daisy chained scenario where a server was a spoke and a hub with a single hub cluster. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-02 15:16:36 -07:00
Derek Collison	25ad3cd4af	Only check ack floor if we are interest policy based. (#4206) Saw a performance issue for a user with a limits based stream with a large number of consumers. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-02 12:43:06 -07:00
Derek Collison	27bbfb7a85	Only check ack floor if we are interest policy based. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-02 11:04:00 -07:00
Artem Seleznev	27a8b96ee3	different panic fixes Signed-off-by: Artem Seleznev <seleznyov.artyom@gmail.com>	2023-06-02 13:19:22 +03:00
Byron Ruth	b24f0f393a	Bump to 2.9.18-beta Signed-off-by: Byron Ruth <byron@nats.io>	2023-05-18 14:22:22 -04:00
Waldemar Quevedo	4f2c9a5184	Prepare v2.9.17 release (#4181 ) Include fix with GoReleaser for nightly.	2023-05-18 11:19:34 -07:00
Byron Ruth	f3dac91d2a	Prepare v2.9.17 release Include fix with GoReleaser for nightly. Signed-off-by: Byron Ruth <byron@nats.io>	2023-05-18 13:57:40 -04:00
Derek Collison	25d9762ce2	[IMPROVED] Make health checks more consistent with stream health checks. (#4180 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-05-18 09:18:12 -07:00
Derek Collison	7e3f3f4908	Make health checks more consistent with stream health checks. Check for closed state on leader change for consumers. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-18 08:18:53 -07:00
Derek Collison	f63d63fbce	[IMPROVED] Stepdown on catchup request for something newer than our state (#4179 ) When we receive a catchup request for an item beyond our current state, we should stepdown. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 19:25:05 -07:00
Derek Collison	4fbc0ee563	Update to Go 1.19.9 (#4178 )	2023-05-17 18:01:58 -07:00
Byron Ruth	3a152a0e40	Update to Go 1.19.9 Signed-off-by: Byron Ruth <byron@nats.io>	2023-05-17 20:57:10 -04:00
Derek Collison	8e825001d2	When we receive a catchup request for an item beyond our current state, we should stepdown. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 17:30:35 -07:00
Derek Collison	7dfe5e528e	Bump to 2.9.17-RC.3 Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 16:46:10 -07:00
Derek Collison	93eaf8c814	Add workflow for stale issues (#4161) This adds a workflow to mark issues and PRs stale after the configured period of time, followed by closing the issue/PR after a subsequent period of time if there was no additional activity. The `debug-only` option is currently set, so even when merged, it will do a dry-run and not perform any actions. Once we inspect the initial logs of the effect of an initial run (impacting existing issues), we can adjust accordingly and then follow up with making it active. For the debug logs to be enabled, we do need to add a repository secret named `ACTIONS_STEP_DEBUG` with a value set to `true` per [this instruction](https://github.com/marketplace/actions/close-stale-issues#debugging).	2023-05-17 16:45:17 -07:00
Derek Collison	94457e2d55	[IMPROVED] Reset logic for streams (#4177 ) When we detect conditions to reset streams, make sure we properly clean up old NRG nodes etc. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 16:45:00 -07:00
Derek Collison	b856bba285	[FIXED] Avoid deadlock with usage lock for an account during checkAndSyncUsage() (#4176 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 16:44:44 -07:00
Derek Collison	a8d7d3886e	Make sure to delete the stream assignment node here Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 16:19:39 -07:00
Derek Collison	44a5875968	Avoid deadlock with usage lock for an account during checkAndSyncUsage(). Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 16:05:46 -07:00
Derek Collison	f3553791b1	Updates to stream reset logic. 1. When catching up do not try forever and if needed reset cluster state. 2. In checking if a stream is healthy check for node drift. 3. When restarting a stream make sure the current node is stopped. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-17 13:14:33 -07:00
Derek Collison	5db57fb053	Bump to 2.9.17-RC.2 Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 14:02:29 -07:00
Derek Collison	ac68a19530	[IMPROVED] Restart consumer behavior during healthz() checks. (#4172 ) Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 13:58:47 -07:00
Derek Collison	a06e1c9b43	Make sure to also stop nodes when dealing with consumer after stream restart Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 13:16:47 -07:00
Derek Collison	3752a6c500	Make sure to stop the node on a consumer restart if still running Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 12:49:46 -07:00
Derek Collison	734895ae47	Fix test flapper Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 12:20:18 -07:00
Derek Collison	87f17fcff4	[FIXED] Avoid stale KV reads on server restart for replicated KV stores. (#4171 ) Make sure to wait properly until we believe we are caught up to enable direct gets on followers. Signed-off-by: Derek Collison <derek@nats.io> Resolves #4162	2023-05-16 11:29:37 -07:00
Derek Collison	b0340ce598	Make sure to wait properly until we believe we are caught up to enable direct gets. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 11:02:06 -07:00

7232 Commits