nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-17 03:24:40 -07:00

Author	SHA1	Message	Date
Derek Collison	2fc3f45ea1	[FIXED] Durable pull consumers could get cleaned up incorrectly on leader change. (#4412) Fix for a bug that would allow old leaders of pull based durables to delete a consumer from an inactivity threshold timer inadvertently. Signed-off-by: Derek Collison <derek@nats.io>	2023-08-21 15:35:44 -07:00
Derek Collison	43314fd439	Fix for a bug that would allow old leaders of pull based durables to delete a consumer from an inactivity threshold. Signed-off-by: Derek Collison <derek@nats.io>	2023-08-21 14:53:09 -07:00
Neil Twigg	7cc5838a6d	Send shutdown event on LDM so that R1 assets do not get assigned to the LDM node Signed-off-by: Neil Twigg <neil@nats.io>	2023-08-21 21:29:01 +01:00
Neil Twigg	c0636d117f	Tweak consumer replica scaling, add unit test for orphaned consumer subjects Signed-off-by: Neil Twigg <neil@nats.io>	2023-08-17 15:27:29 +01:00
Derek Collison	081140ee67	When taking over make sure to sync and reset clfs for clustered streams. Signed-off-by: Derek Collison <derek@nats.io>	2023-08-03 10:41:10 -07:00
Derek Collison	5c8db89506	Make sure we do not drift on accounting. Three issues were found and resolved. 1. Purge replays after recovery could execute full purge. 2. Callback was registered without lock, which could lead to skew. 3. Cluster reset could stop stream store and recreate it, which could lead to double accounting. Signed-off-by: Derek Collison <derek@nats.io>	2023-08-01 18:35:20 -07:00
Neil Twigg	979b265e26	Tweak timing in `TestJetStreamClusterDeleteConsumerWhileServerDown` Signed-off-by: Neil Twigg <neil@nats.io>	2023-07-14 16:44:15 +01:00
Derek Collison	9e9a9a082b	When restoring a filestore with no key generator but it was encrypted, fail to restore. Signed-off-by: Derek Collison <derek@nats.io>	2023-07-11 16:27:50 -07:00
Derek Collison	a2b9ee9123	Shorten stream size for travis Signed-off-by: Derek Collison <derek@nats.io>	2023-06-28 15:56:41 -07:00
Derek Collison	1bb1a3cae1	Do not health check streams that are actively being restored. Could leave them in a bad state. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-28 15:27:45 -07:00
Derek Collison	9eeffbcf56	Fix performance issues with checkAckFloor. Bail early if new consumer, meaning stream sequence floor is 0. Decide which linear space to scan. Do no work if no pending and we just need to adjust which we do at the end. Also realized some tests were named wrong and were not being run, or were in wrong file. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-08 18:45:03 -07:00
Derek Collison	779978d817	Extended replay leafnode test to confirm mirror functionality Signed-off-by: Derek Collison <derek@nats.io>	2023-06-07 14:01:43 -07:00
Derek Collison	4ac45ff6f3	When consumers were R1 and the same name was reused, server restarts could try to cleanup old ones and affect the new ones. These changes allow consumer name reuse more effectively during server restarts. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-05 12:48:18 -07:00
Maurice van Veen	132567de39	Fix PurgeEx replay with sequence & keep succeeds	2023-06-04 11:56:28 +02:00
Derek Collison	dee532495d	Make sure to process extended purge operations correctly when being replayed on a restart. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-03 17:49:45 -07:00
Derek Collison	1bce79750e	When we were optimizing for single cluster but large number of leafnodes we inadvertently broke a daisy chained scenario where a server was a spoke and a hub with a single hub cluster. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-02 15:16:36 -07:00
Derek Collison	734895ae47	Fix test flapper Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 12:20:18 -07:00
Derek Collison	b0340ce598	Make sure to wait properly until we believe we are caught up to enable direct gets. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-16 11:02:06 -07:00
Derek Collison	5e029d08d5	For older R1 streams created by previous servers we could have no cluster for the stream assignment group which would prevent scale up with newer servers. This will inherit cluster if detected from placement tags or client cluster designation. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-10 17:59:28 -07:00
Derek Collison	da8aeac91b	Fix flapper Signed-off-by: Derek Collison <derek@nats.io>	2023-05-03 21:00:17 -07:00
Derek Collison	21239022bd	Protect against usage drift for any unforeseen reason and if detected correct. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-03 17:09:06 -07:00
Derek Collison	f098c253aa	Make sure we adjust accounting reservations when deleting a stream with any issues. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-01 15:54:37 -07:00
Derek Collison	f5ac5a4da0	Fix for a bug that could leave a raft node running when stopping a stream. This can happen when we reset a stream internally and the stream had a prior snapshot. Also make sure to always release resources back to the account regardless if the store is no longer present. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-01 13:22:06 -07:00
Derek Collison	546dd0c9ab	Make sure we can recover an underlying node being stopped. Do not return healthy if the node is closed, and wait a bit longer for forward progress. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-29 07:42:23 -07:00
Derek Collison	d107ba3549	Under certain scenarios we have witnessed healthz() that never return healthy due to a stream or consumer being missing or stopped. This will now allow the healthy call to attempt to restart those assets. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-28 17:11:08 -07:00
Derek Collison	7f06d6f5a7	When Jsz() was asked for consumer details, would report incorrect data if not a consumer leader. This is due to the way state is maintained for leaders vs followers for consumers. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-26 15:03:15 -07:00
Derek Collison	c0f5b71a8f	Test that makes sure that assets that have been created under a certain cluster can be upgraded to a new cluster. This is specifically when a cluster is reconfigured and the servers are restarted with a new cluster name. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-24 20:06:20 -07:00
Derek Collison	8b7c2d12aa	Run a check for ack floor drift when taking over as a leader and the ack go routine is spun up. Also periodically check. If all normal will be very cheap. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-21 11:59:35 -07:00
Derek Collison	7d3ec51d79	Fix for flapping test Signed-off-by: Derek Collison <derek@nats.io>	2023-04-03 14:46:59 -07:00
Ivan Kozlovic	a4df4f8727	Fixed some tests Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-03-30 15:02:59 -06:00
Derek Collison	4646f4af5d	Do not allow any JetStream leaders to be placed on a lameduck server Signed-off-by: Derek Collison <derek@nats.io>	2023-03-29 20:15:41 -07:00
Derek Collison	02702e4620	[IMPROVEMENT] General stability and bug fixes. (#3999) This PR has general improvements and fixes to filestore, raft, and the clustering layer. Summary 1. Additional support for preAck handling for interest based streams when replicated acks arrive before the message itself. 2. Better handling when checking state to determine whether to remove an interest based message. 3. Improved StepDown() and leadership transfer handling after restarts. 4. Improved voting logic for high load systems. 5. Various improvements and fixes for filestore Compact(), which is used heavily in the raft layer when updating snapshots and the raft wal. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-29 17:09:44 -07:00
Derek Collison	182bf6cbae	Bug fixes and general stability improvements. 1. If reset ignore Applied() that are greater than our commit. 2. Improved StepDown() by placing at back of queue if preferred. 3. Improved handling of leadership transfer during StepDown(). 4. Do not store EntryLeaderTransfer records on disk. 5. Remove un-needed processing of older terms. 6. If append entry has higher term, also inherit pterm. 7. Only inherit a candidate's term if we decide to vote for them. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-29 12:43:46 -07:00
Neil Twigg	8d5519356e	Shut down RAFT groups when disabling JetStream Signed-off-by: Neil Twigg <neil@nats.io>	2023-03-23 16:54:01 +00:00
Derek Collison	9ccd7abdf8	Test for preAcks Signed-off-by: Derek Collison <derek@nats.io>	2023-03-21 12:08:24 -07:00
Derek Collison	5a16f98427	Fixed an off by one bug that under certain circumstances could cause large consumer replica states. This could lead to instability in the system. The bug would manifest in replicated consumers when certain messages could be acked out of order, and, the pending list would never go to zero. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-19 10:41:59 -07:00
Derek Collison	f0e1585490	Fix flapping test Signed-off-by: Derek Collison <derek@nats.io>	2023-03-17 13:14:43 -07:00
Derek Collison	5bb6f167b9	Make sure to cleanup messages on a follower consumer for an interest based stream when the consumer leader sends a state snapshot. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-15 20:11:16 -07:00
Derek Collison	8dbfbbe577	Fix test Signed-off-by: Derek Collison <derek@nats.io>	2023-03-15 17:23:51 -07:00
Derek Collison	5a1878b015	Fix for workqueue stream scaling up and not removing acked messages. Make sure when scaling up streams that are workqueue or interest policy that consumers scale as well. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-13 17:13:49 -07:00
Derek Collison	724160ebac	Fix flapping tests Signed-off-by: Derek Collison <derek@nats.io>	2023-02-28 14:30:23 -08:00
Derek Collison	6078706544	Fixup test for new parameters Signed-off-by: Derek Collison <derek@nats.io>	2023-02-27 18:56:55 -08:00
Tomasz Pietrek	02ba78454d	Fix new replicas late MaxAge expiry This commit fixes the issue when scaling Stream with MaxAge and some older messages stored. Until now, old messages were not properly expired on new replicas, because new replicas first expiry timer was set to MaxAge duration. This commit adds a check if received messages expiry happens before MaxAge, meaning they're messages older than the replica. https://github.com/nats-io/nats-server/issues/3848 Signed-off-by: Tomasz Pietrek <tomasz@nats.io>	2023-02-24 00:46:02 +01:00
Neil Twigg	cfea34c80c	Install snapshot and compact when WAL grows, even when no state changes occur	2023-02-22 20:00:57 +00:00
Tomasz Pietrek	337a9f2cbd	Improve test for consumer with inactivity threshold Signed-off-by: Tomasz Pietrek <tomasz@nats.io>	2023-02-19 17:57:09 +01:00
Derek Collison	06fd81d096	Fixed a bug where a named consumer under interest policy was spinning up inactive threshold timers in all replicas not just the leader. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-19 06:08:43 -08:00
Derek Collison	6a4c61e1a3	Merge branch 'main' into bad-consumer-delete	2023-02-18 11:09:56 -08:00
Derek Collison	01fa89a0b4	Fix for deleting consumers on restarts and non-fatal update errors. If there was a spurious error on restart, or possibly on an update, we could delete a consumer which was the incorrect behavior. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-18 09:46:52 -08:00
Derek Collison	efa3bcc49d	Parallel consumer creation could drop responses (create and info) and could also run monitorConsumer twice. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-18 05:16:05 -08:00
Waldemar Quevedo	4452f64d73	Fix TestJetStreamParallelConsumerCreation race Signed-off-by: Waldemar Quevedo <wally@nats.io>	2023-02-15 17:23:48 -08:00

103 Commits