nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-16 19:14:41 -07:00

Author	SHA1	Message	Date
Derek Collison	087a28a13e	When creating replicated mirrors where the source stream had a very large starting sequence number, the server would use excessive CPU and Memory. This is due to the mirroring functionality trying to skip messages when it detects a gap. In a replicated stream this puts excessive stress on the raft system. This step is not needed at all if the mirror stream has no messages, we can simply jump ahead. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-15 17:20:15 -07:00
Derek Collison	9eeffbcf56	Fix performance issues with checkAckFloor. Bail early if new consumer, meaning stream sequence floor is 0. Decide which linear space to scan. Do no work if no pending and we just need to adjust which we do at the end. Also realized some tests were named wrong and were not being run, or were in wrong file. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-08 18:45:03 -07:00
Neil Twigg	d7ae2cbb5f	Backport #4120 to `main` Signed-off-by: Neil Twigg <neil@nats.io>	2023-05-09 11:24:35 +01:00
Ivan Kozlovic	95e4f2dfe1	Fixed accounts configuration reload Issues could manifest with subscription interest not properly propagated. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-03 14:35:06 -06:00
Derek Collison	c15cc0054a	When a fleet of leafnodes are isolated (not routed but using same cluster) we could do better at optimizing how we update the other leafnodes. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-30 17:08:16 -07:00
Derek Collison	3340179b97	Fix flapper Signed-off-by: Derek Collison <derek@nats.io>	2023-04-24 22:22:27 -07:00
Derek Collison	aee73a9c77	Fix flapping test Signed-off-by: Derek Collison <derek@nats.io>	2023-04-08 21:58:54 -07:00
Derek Collison	ffc49b8f86	Fix flapping test and data race in test Signed-off-by: Derek Collison <derek@nats.io>	2023-04-08 08:13:31 -07:00
Derek Collison	07b34f707f	Make sure to never process next message requests inline Signed-off-by: Derek Collison <derek@nats.io>	2023-04-03 20:50:01 -07:00
Derek Collison	94278e731a	More tweaks to test due to slow network proxy being more accurate Signed-off-by: Derek Collison <derek@nats.io>	2023-04-02 19:57:34 -07:00
Derek Collison	5afcb6c13c	Fix for flapping test, network proxy more accurate now so rtt needed to be tweaked Signed-off-by: Derek Collison <derek@nats.io>	2023-04-02 19:06:42 -07:00
Derek Collison	d5ac4d283a	Fix for flapping test, can return invalid sequence as well Signed-off-by: Derek Collison <derek@nats.io>	2023-04-02 16:18:23 -07:00
Derek Collison	1fb1efd748	Make sure to remove any inflight entries when done Signed-off-by: Derek Collison <derek@nats.io>	2023-04-02 14:41:49 -07:00
Derek Collison	e6447c982a	Protect against concurrent creation of streams and consumers. Also make sure we have exited monitoring routines when doing resets for both streams and consumers. Signed-off-by: Derek Collison <derek@nats.io>	2023-04-02 14:29:52 -07:00
Derek Collison	b5358fa4b3	Wait for shutdown and sleep to let state build up Signed-off-by: Derek Collison <derek@nats.io>	2023-04-02 03:53:05 -07:00
Derek Collison	ad5bb366a0	Updates to preacks when multiple consumers are present but mutually exclusive (filtered). Signed-off-by: Derek Collison <derek@nats.io>	2023-03-31 10:43:28 -07:00
Derek Collison	5e85889790	[IMPROVED] Improvements to preAcks. (#4006) Better handling of multiple consumers so as to not delete messages too early. Better cleanup handling. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-30 21:08:34 -07:00
Derek Collison	937ef0d2a6	Improvements to preAcks. Better handling of multiple consumers so as to not delete too early. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-30 20:29:15 -07:00
Ivan Kozlovic	a4df4f8727	Fixed some tests Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-03-30 15:02:59 -06:00
Derek Collison	873ab0f6b9	Fix for flapping test Signed-off-by: Derek Collison <derek@nats.io>	2023-03-29 18:55:41 -07:00
Derek Collison	c546828359	Moved long running test to NoRace suite Signed-off-by: Derek Collison <derek@nats.io>	2023-03-29 16:56:04 -07:00
Derek Collison	e97ddcd14f	Tweak tests due to changes, make test timeouts uniform. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-29 12:43:59 -07:00
Derek Collison	0d9f707b4b	Additional tests to stress interest based streams with pull subscribers during rolling restarts. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-29 12:43:55 -07:00
Derek Collison	9ccd7abdf8	Test for preAcks Signed-off-by: Derek Collison <derek@nats.io>	2023-03-21 12:08:24 -07:00
Derek Collison	ed9de4b0a1	Improved publisher performance under some instances of asymmetric network latency clusters on interest based streams. Under asymmetric network latency based clusters, if a node in an R3 was replicating a consumer and the parent stream, but was the leader of neither, and the path from the stream leader was faster than the path from the consumer leader, a replicated ack could arrive before the message itself. In this case we used to forward a delete message request to the stream leader which would then replicate that to all stream replicas, causing more work which could lead to increased publisher times on clients connected to the slow node. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-20 20:53:45 -07:00
Derek Collison	5a16f98427	Fixed an off by one bug that under certain circumstances could cause large consumer replica states. This could lead to instability in the system. The bug would manifest in replicated consumers when certain messages could be acked out of order, and, the pending list would never go to zero. Signed-off-by: Derek Collison <derek@nats.io>	2023-03-19 10:41:59 -07:00
Derek Collison	ebe08040e9	Attempt to fix flapper again Signed-off-by: Derek Collison <derek@nats.io>	2023-03-01 06:24:51 -08:00
Derek Collison	baca7bd751	Fix for test flapper Signed-off-by: Derek Collison <derek@nats.io>	2023-03-01 04:58:01 -08:00
Derek Collison	2642a8c03d	Optimize locking for when under heavy loads. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-27 18:56:55 -08:00
Derek Collison	d347cb116a	When becoming leader optionally send current snapshot to followers if caught up. This can help sync on restarts and improve ghost ephemerals. Also added more code to suppress responses and API audits when we know we are recovering. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-23 10:30:36 -08:00
Derek Collison	2972c11be6	Improve consumer create performance. In cases where we had a large subject space, a filestore with many msg blocks, and a filtered consumer with a wildcard filtered subject, creating a consumer could take more memory and time than we wanted. This improvement works when the consumer is DeliverAll and we used the upper layer in memory psim structure to scan but only in memory and avoid a file read for each msg block. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-22 19:42:02 -08:00
Derek Collison	f16a7d8559	Skip test for now Signed-off-by: Derek Collison <derek@nats.io>	2023-02-22 15:49:48 -08:00
Derek Collison	d03d8e9d93	When having a max msgs per subject (e.g. KV) under heavy concurrent usage could skew the accounting for the underlying filestore. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-22 12:50:43 -08:00
Derek Collison	11b0f214d0	Do not re-calculate NumPending on consumer info calls. We noticed this was being called a lot in user environments. When the consumer was filtered with a wildcard and the stream had a high cardinality of subjects and was falling behind, this could take a substantial amount of time. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-16 16:30:14 -08:00
Derek Collison	32b5ec16dd	Fixed test to correspond to new limit of 1024. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-16 07:16:19 +04:00
Derek Collison	1e3c2810f4	Improve expireMsgs minAge calculation for when lots of messages to expire in each callback. This happens when under extreme load as shown in the skipped test. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-13 18:39:39 +02:00
Derek Collison	e9a983c802	Do not let !NeedSnapshot() avoid snapshots and compaction. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-01 22:05:25 -07:00
Derek Collison	390fd02918	Updates to tests for updated Go client changes Signed-off-by: Derek Collison <derek@nats.io>	2023-01-31 09:47:36 -08:00
Ivan Kozlovic	79ca0c1787	Move test to "norace_test.go" The test TestJetStreamClusterConsumerListPaging was in the jetstream_cluster_3_test.go and because of `-race` flag would take more than 440 seconds (7+ minutes) as seen here: https://app.travis-ci.com/github/nats-io/nats-server/jobs/593984385#L335 Without the `-race` flag, this test takes ~17 seconds. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-01-23 17:05:18 -07:00
Neil Twigg	14d0ba1c65	Fix some lint errors after move to `golangci-lint`	2022-12-30 20:00:08 +00:00
Derek Collison	c90fe9a2fa	Improve performance and latency with large number of sparse consumers. When a stream had a large number of consumers on a server that were sparse, the signaling mechanism would do a linear scan to signal matching consumers. Usage patterns have continued to have more consumers that are filtered and sparse, meaning a message is destined for a single or small number of consumers. This change moves selection to a sublist that tracks only active consumer leaders for selection, which optimizes selection of consumers to signal when the number of consumers is large. Signed-off-by: Derek Collison <derek@nats.io>	2022-12-13 15:25:55 -08:00
Marco Primi	f8a030bc4a	Use testing.TempDir() where possible Refactor tests to use go built-in temporary directory utility for tests. Also avoid binding to default port (which may be in use)	2022-12-12 13:18:44 -08:00
Derek Collison	894115b82b	Fix for server panic when consumer state was not decoded correctly. The bug was when a timestamp for the pending state was exactly -1 which could happen based on timing of the redelivered pending items which would set pending.Timestamp into the future potentially and the timing on the encodeConsumerState call. Minor fixes to raft. Signed-off-by: Derek Collison <derek@nats.io>	2022-12-06 14:16:20 -08:00
Derek Collison	9f241f3322	Offload signaling to consumers when number is large. Signed-off-by: Derek Collison <derek@nats.io>	2022-11-15 11:25:07 -08:00
Derek Collison	4dab6ce92c	Fix test timing Signed-off-by: Derek Collison <derek@nats.io>	2022-11-09 19:44:22 -08:00
Derek Collison	c6031382a1	Fix for #3499 When we deleted a consumer from an interest policy stream we would make sure to clean up any unacked messages. However, we based the start only on the ack floor for the consumer and did not take into account the first sequence of the stream. Signed-off-by: Derek Collison <derek@nats.io>	2022-11-05 13:56:45 -07:00
Ivan Kozlovic	170ff49837	[ADDED] JetStream: peer (the hash of server name) in statsz/jsz A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something like this: ``` ... "meta_cluster": { "name": "local", "leader": "A", "peer": "NUmM6cRx", "replicas": [ { "name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w" }, { "name": "Server name unknown at this time (peerID: jZ6RvVRH)", "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH" } ], "cluster_size": 3 } ``` Note the "peer" field following the "leader" field that contains the server name. The new field is the node ID, which is a hash of the server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-16 15:31:37 -06:00
Derek Collison	6c97733bb8	Optimize needAck. Signed-off-by: Derek Collison <derek@nats.io>	2022-09-14 16:25:50 -07:00
Derek Collison	d979937bbd	Merge pull request #3456 from nats-io/max-bytes-pull [IMPROVED] Pull request logic	2022-09-08 12:08:10 -07:00
Derek Collison	dedf21d45d	Fix for issue #3455 When hitting max ack pending from getNextMsg would remove one shots incorrectly. Signed-off-by: Derek Collison <derek@nats.io>	2022-09-08 11:56:57 -07:00
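
The `meta_cluster` payload quoted in commit 170ff49837 can be consumed on the client side roughly as follows. This is only a minimal sketch: the `Replica` and `MetaCluster` types and the `parseMetaCluster` and `offlinePeers` helpers are hypothetical names that mirror just the fields visible in that sample, not the server's own definitions, and the sample is wrapped in an outer object on the assumption that `meta_cluster` is one key of a larger response (the commit elides the rest with `...`).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Replica and MetaCluster are hypothetical types covering only the fields
// that appear in the sample payload from the commit message.
type Replica struct {
	Name    string `json:"name"`
	Current bool   `json:"current"`
	Offline bool   `json:"offline,omitempty"`
	Active  int64  `json:"active"`
	Peer    string `json:"peer"` // node ID: a hash of the server name
}

type MetaCluster struct {
	Name        string    `json:"name"`
	Leader      string    `json:"leader"` // server name of the leader
	Peer        string    `json:"peer"`   // node ID of the leader
	Replicas    []Replica `json:"replicas"`
	ClusterSize int       `json:"cluster_size"`
}

// parseMetaCluster decodes a JSON object that carries a "meta_cluster" key.
func parseMetaCluster(data []byte) (MetaCluster, error) {
	var wrapper struct {
		MetaCluster MetaCluster `json:"meta_cluster"`
	}
	err := json.Unmarshal(data, &wrapper)
	return wrapper.MetaCluster, err
}

// offlinePeers returns the node IDs of replicas reported as offline.
func offlinePeers(mc MetaCluster) []string {
	var ids []string
	for _, r := range mc.Replicas {
		if r.Offline {
			ids = append(ids, r.Peer)
		}
	}
	return ids
}

// sample reproduces the fragment from the commit message, wrapped in braces
// (an assumption) so that it forms a complete JSON document.
const sample = `{
  "meta_cluster": {
    "name": "local",
    "leader": "A",
    "peer": "NUmM6cRx",
    "replicas": [
      {"name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w"},
      {"name": "Server name unknown at this time (peerID: jZ6RvVRH)",
       "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH"}
    ],
    "cluster_size": 3
  }
}`

func main() {
	mc, err := parseMetaCluster([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println("leader:", mc.Leader, "leader peer:", mc.Peer)
	fmt.Println("offline peers:", offlinePeers(mc))
}
```

Because the offline replica's name is unknown in the sample, the `peer` value is the only stable handle for it, which is why the sketch collects node IDs rather than names.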


188 Commits