nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-15 18:50:41 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	0e841d4acf	Tweak ordered consumer flow control and bump to beta.18 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-04-14 17:43:43 -06:00
Derek Collison	04cce6df68	Merge pull request #3020 from nats-io/move-updates [IMPROVED] Raft layer for general stability and leader election.	2022-04-11 17:33:13 -07:00
Matthias Hanel	13e5ab10bd	fix js nex interest check where leaf node masked gw subj propagation (#3016) basically a gw subject propagation issue could be hidden behind a leaf node. also change error text when this was the case Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-04-11 14:04:09 -04:00
Derek Collison	37cbac99e7	Improvements to the raft layer for general stability and support of scale up and down and asset move. Also fixed a bug that would allow a leadership transfer when catching up. Signed-off-by: Derek Collison <derek@nats.io>	2022-04-10 08:59:39 -07:00
Derek Collison	7e38ebcb6e	Allow assets such as streams and their associated consumers to migrate between clusters. The system will allow an update to a stream, and subsequently all attached consumers, to be placed in another cluster either directly or via tag placement. The meta layer will scale the underlying peerset appropriately to straddle the two clusters for both the stream and consumers, taking into account the consumer type. Control will then pass to the current leaders of the assets who will monitor the catchup status of the new peers. (Note we can optimize this later to only traverse once across a GW for any given asset, but for now this is simpler) Once the original leaders have determined the assets are synched it will pass leadership to a member of the new peerset. Once the new leader has been elected, it will forward a request for the meta layer to shrink the peerset by removing the old peers. Signed-off-by: Derek Collison <derek@nats.io>	2022-04-04 18:28:36 -07:00
Matthias Hanel	92f4dc986a	added max_ack_pending setting to js account limits (#2982) * added max_ack_pending setting to js account limits because of the addition, defaults now have to be set later (depend on these new limits now) also re-organized the code to closer track how stream create looks Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-03-31 14:17:16 -04:00
Ivan Kozlovic	4ddbdbd74c	Rewrite trackDownAccountAndInterest() to make it easier to read Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-30 16:41:22 -06:00
Ivan Kozlovic	c0ab2d4959	[FIXED] Possible panic due to data races A panic was reported that looked like this: ``` fatal error: concurrent map read and map write goroutine 200 [running]: runtime.throw({0xa366ce, 0xe620e0}) /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/panic.go:1198 +0x71 fp=0xc00105f098 sp=0xc00105f068 pc=0x434ff1 runtime.mapaccess1_faststr(0x0, 0x0, {0xc0054b6f18, 0x11}) /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/map_faststr.go:21 +0x3a5 fp=0xc00105f100 sp=0xc00105f098 pc=0x412285" github.com/nats-io/nats-server/v2/server.(consumer).processNextMsgReq(0xc000681000, 0xc00105f2a8, 0x4503e9, 0x11, {0x0, 0xc000246900}, {0xc0054b6f18, 0x11}, {0xc0002469c4, 0x90, ...}) /home/travis/gopath/src/github.com/nats-io/nats-server/server/consumer.go:2454 +0x8ce fp=0xc00105f250 sp=0xc00105f100 pc=0x77dc2e github.com/nats-io/nats-server/v2/server.(consumer).processNextMsgReq-fm(0x9c, 0x7f302e954fff, 0xc00105f2f8, {0xc000774280, 0x400}, {0xc0054b6f18, 0x40}, {0xc0002469c4, 0x90, 0x63c}) /home/travis/gopath/src/github.com/nats-io/nats-server/server/consumer.go:2380 +0x77 fp=0xc00105f2b8 sp=0xc00105f250 pc=x91e337 github.com/nats-io/nats-server/v2/server.(client).deliverMsg(0xc0015f8000, 0xc003034f00, 0x41642f, {0xc000246969, 0x4b6166, 0x697}, {0xc0002469a9, 0x4b60be, 0x657}, {0xc0015f9480, ...}, ...) /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3180 +0xbb0 fp=0xc00105f530 sp=0xc00105f2b8 pc=0x764470 github.com/nats-io/nats-server/v2/server.(client).processMsgResults(0xc0015f8000, 0x8cd7a5, 0xc0089fb440, {0xc0002469c4, 0x92, 0x63c}, {0x0, 0x0, 0x4}, {0xc000246969, ...}, ...) 
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:4163 +0x9af fp=0xc00105fa48 sp=0xc00105f530 pc=0x769e4f github.com/nats-io/nats-server/v2/server.(client).processInboundRoutedMsg(0xc0015f8000, {0xc0002469c4, 0xc0015f8220, 0x63c}) /home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:443 +0x159 fp=0xc00105fae8 sp=0xc00105fa48 pc=0x8ce299 github.com/nats-io/nats-server/v2/server.(client).processInboundMsg(0xc0015f8000, {0xc0002469c4, 0x92, 0x79e}) /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3493 +0x36 fp=0xc00105fb18 sp=0xc00105fae8 pc=0x765c76 github.com/nats-io/nats-server/v2/server.(client).parse(0xc0015f8000, {0xc000246800, 0x800, 0xc087258a5d30c937}) /home/travis/gopath/src/github.com/nats-io/nats-server/server/parser.go:497 +0x246a fp=0xc00105fd98 sp=0xc00105fb18 pc=0x8a4f6a github.com/nats-io/nats-server/v2/server.(client).readLoop(0xc0015f8000, {0x0, 0x0, 0x0})" /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1227 +0xe1f fp=0xc00105ffb0 sp=0xc00105fd98 pc=0x75841f github.com/nats-io/nats-server/v2/server.(Server).createRoute.func1() /home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:1372 +0x25 fp=0xc00105ffe0 sp=0xc00105ffb0 pc=0x8d46a5 runtime.goexit ``` Writing a test showed the data race: ``` ================== WARNING: DATA RACE Read at 0x00c0008ea240 by goroutine 62: runtime.mapaccess1_faststr() /usr/local/go/src/runtime/map_faststr.go:12 +0x0 github.com/nats-io/nats-server/v2/server.(consumer).processNextMsgRequest() /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/consumer.go:2567 +0xa64 (...) 
Previous write at 0x00c0008ea240 by goroutine 15: runtime.mapdelete_faststr() /usr/local/go/src/runtime/map_faststr.go:300 +0x0 github.com/nats-io/nats-server/v2/server.(Account).checkForReverseEntry() /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1759 +0x61c github.com/nats-io/nats-server/v2/server.(client).unsubscribe() /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:2838 +0xa27 (...) ``` After fixing this data race, another showed up: ``` ================== WARNING: DATA RACE Read at 0x00c000352200 by goroutine 99: github.com/nats-io/nats-server/v2/server.(Account).checkForReverseEntry() /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1752 +0x4b3 github.com/nats-io/nats-server/v2/server.(client).unsubscribe() /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:2838 +0xa27 (...) Previous write at 0x00c000352200 by goroutine 92: runtime.slicecopy() /usr/local/go/src/runtime/slice.go:284 +0x0 github.com/nats-io/nats-server/v2/server.(Account).checkForReverseEntry() /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1737 +0x871 github.com/nats-io/nats-server/v2/server.(Account).removeRespServiceImport() /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1622 +0x24c (...) ``` This PR addresses both. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-30 13:51:52 -06:00
Derek Collison	eb16c35016	OrderedConsumer was very conservative with slow start and small max outstanding bytes. This is increasing perf for longer rtt. Signed-off-by: Derek Collison <derek@nats.io>	2022-03-30 05:08:36 -07:00
Derek Collison	bfc1462fb3	Merge pull request #2973 from nats-io/issue-2936 [IMPROVED] Consumer snapshot logic in clustered mode and disk usage.	2022-03-29 18:29:31 -07:00
Derek Collison	607858f213	Improved consumer snapshot logic in clustered mode and disk usage. Also fixed a bug that could cause memory based replicated consumers to no longer work after snapshots and server restarts. The snapshot logic would allow non-state changing updates to continuously grow the raft logs. We also were too conservative on when we snapshotted and why. Also added in ability to have FileStore.Compact() reclaim space from the block file from the head of last changed block. Signed-off-by: Derek Collison <derek@nats.io>	2022-03-29 18:02:49 -07:00
Ivan Kozlovic	e1c581334e	[CHANGED] JetStream: lower default consumer's maximum ack pending The default value is lowered from 20,000 to 1,000. This does not seem to have a performance degradation impact, but may help with scalability. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-29 15:30:40 -06:00
Matthias Hanel	1aeaaf0ca3	Adding server limits (max ack pending/dedupe window) to js config (#2967) * Adding server limits (max ack pending/dedupe window) to js config Also shifting consumer config check to jsConsumerCreate as in clustered mode this was enforced in the wrong place Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-03-29 13:19:36 -04:00
Matthias Hanel	0c5f3688a7	[ADDED] Tiered limits and fix limit issues on updates (#2945) * Adding tiered limits and fix limit issues on updates Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-03-28 20:47:54 -04:00
Ivan Kozlovic	25886e8819	[FIXED] JetStream: sampling not updated during consumer update Resolves #2941 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-28 10:58:58 -06:00
Ivan Kozlovic	5e89374ee9	Fixed another possible lock inversion consumer->stream Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-25 12:21:51 -06:00
Ivan Kozlovic	4739eebfc4	[FIXED] JetStream: possible deadlock during consumer leadership change Would possibly show up when a consumer leader changes for a consumer that had redelivered messages and for instance messages were inbound on the stream. Resolves #2912 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-25 12:21:51 -06:00
Derek Collison	ef8f543ea5	Improve memory usage through JetStream storage layer. Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoid copies. Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them. The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally. Also fixed some flappers and a bug where de-dupe might not be reformed correctly. Signed-off-by: Derek Collison <derek@nats.io>	2022-03-24 17:45:15 -06:00
Ivan Kozlovic	2253bb6f1a	JS: BackOff list caused too frequent checkPending() calls Since the "next" timer value is set to the AckWait value, which is the first element in the BackOff list if present, the check would possibly happen at this interval, even when we were past the first redelivery and the backoff interval had increased. The end-user would still see the redelivery be done at the durations indicated by the BackOff list, but internally, we would be checking at the initial BackOff's ack wait. I added a test that uses the store's interface to detect how many times the checkPending() function is invoked. For this test it should have been invoked twice, but without the fix it was invoked 15 times. Also fixed an unrelated test that could possibly deadlock causing tests to be aborted due to inactivity on Travis. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-23 12:46:17 -06:00
Ivan Kozlovic	c3da392832	Changes to IPQueues Removed the warnings, instead have a sync.Map where they are registered/unregistered and can be inspected with an undocumented monitor page. Added the notion of "in progress" which is the number of messages that have been pop()'ed. When recycle() is invoked this count goes down. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-17 17:53:06 -06:00
Ivan Kozlovic	fe6d7b305f	Merge pull request #2898 from nats-io/js_cons_ack_processing [CHANGED] JetStream: Redeliveries may be delayed if necessary	2022-03-17 10:57:22 -06:00
Derek Collison	dbfa47f9b1	Improve state preservation for consumers, specifically DeliverNew variants when no activity has been present. Signed-off-by: Derek Collison <derek@nats.io>	2022-03-16 20:55:14 -07:00
Derek Collison	3216eb5ee5	When a consumer has no state we are now compacting the log, but were not snapshotting. This caused issues on leader change and losing quorum. Signed-off-by: Derek Collison <derek@nats.io>	2022-03-09 07:21:25 -05:00
Derek Collison	58da4b917a	Made improvements to scale up and down for streams and consumers. Signed-off-by: Derek Collison <derek@nats.io>	2022-03-06 16:59:02 -08:00
Derek Collison	31a19729b0	When removing a stream peer with an attached durable consumer, the consumer could become inconsistent. Signed-off-by: Derek Collison <derek@nats.io>	2022-03-06 05:42:22 -08:00
Ivan Kozlovic	804ce102ac	[CHANGED] JetStream: Redeliveries may be delayed if necessary We have seen situations where when a lot of pending messages accumulate, there is a contention between the processing of the ACKs and the checking of the pending map. Decision is made to abort checking of pending list if processing of ack(s) would be delayed because of that. The result is that a redelivery may be postponed. Internally, the ACKs are also now using a queue to prevent processing of them from the network handler, which could cause head-of-line blocking, especially bad for routes. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-03 10:32:35 -07:00
Derek Collison	1c8f7de848	On filtered subjects when consumers were staggered we need to disqualify a filtered consumer if not applicable. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-16 18:24:27 -08:00
Derek Collison	5a93b0e9d8	Allow pull requests to specify a heartbeat when idle to detect when a request is invalidated. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-11 09:51:51 -08:00
Ivan Kozlovic	55ffde7251	Fixed consumer dlv count and num pending wrong due to redeliveries Introduced by #2848, so should not have impacted existing releases. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-02-10 17:33:53 -07:00
Derek Collison	ecfe42630a	Merge pull request #2858 from nats-io/add_consumer_with_info Make sure we snapshot initial consumer info during consumer creation.	2022-02-09 17:05:01 -08:00
Derek Collison	da9046b2e6	Snapshot initial consumer info when needed. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-09 15:23:53 -08:00
Derek Collison	c13a84cf44	Fixed a bug that would calculate the first sequence of a filteredPending incorrectly. Also added in more optimized version to select the first matching message in a message block for LoadNextMsg. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-08 13:29:38 -08:00
Derek Collison	d50febeeff	Improved sparse consumers replay time. When a stream has multiple subjects and a consumer filters the stream to a small and spread out list of messages the logic would do a linear scan looking for the next message for the filtered consumer. This CL allows the store layer to utilize the per subject info to improve the times. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-07 17:26:32 -08:00
Derek Collison	55b7f11c9a	Fixed flow control stall under specific conditions of message size. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-05 20:15:48 -08:00
Ivan Kozlovic	30c431a9a3	[FIXED] JetStream: BackOff redeliveries would always use first in list If the consumer's sequence was not the same as the stream's sequence, then the redelivery would always use the first duration from the BackOff list. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-01-31 17:44:08 -07:00
Derek Collison	b38ced51b2	A true no wait pull request was not considering redeliveries. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-30 11:52:35 -08:00
Derek Collison	275d42628b	Fix for #2828. The original design of the consumer and the subsequent store did not allow updates. Now that we do, we need to store the new config into our storage layer. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-30 09:45:05 -08:00
Derek Collison	a57bd96def	Updating a push consumer to be pull would succeed but cause a panic if used. This disallows that upgrade. We had a check in place for pull to push, but not the reverse. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-28 13:11:58 -08:00
Derek Collison	6be9925127	Update config error Signed-off-by: Derek Collison <derek@nats.io>	2022-01-24 15:02:41 -08:00
Derek Collison	65b168aa8b	Updates based on feedback. MaxDeliver needs to be set properly now to be > len(BackOff) but if larger we will reuse last value in BackOff array. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-24 15:02:39 -08:00
Derek Collison	bd78b1a99b	Formal json version for NAK delay Signed-off-by: Derek Collison <derek@nats.io>	2022-01-24 15:01:52 -08:00
Derek Collison	d486c24199	Allow a consumer to be configured with BackOffs. This allows a consumer to have exponential backoffs vs static AckWait and MaxDeliver. When BackOff is set it will override AckWait to BackOff[0] and MaxDeliver will be len(BackOff)+1. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-24 14:57:36 -08:00
Derek Collison	579bf336ad	Allow NAK to take a delay parameter to delay redelivery for a certain amount of time. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-24 14:57:28 -08:00
Derek Collison	d332684322	Fixed data race and fixed bug that we would not clear our waiting queue when a leader stepped down. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-24 13:01:25 -08:00
Derek Collison	6fd41e5ea4	Updates based on review feedback Signed-off-by: Derek Collison <derek@nats.io>	2022-01-24 10:23:47 -08:00
Derek Collison	d962500827	Track reply subjects for pending pull requests across clustered consumers. We will only send if all peers in our group are >= 2.7.1 and we will check for updates. When a consumer follower takes over it will notify all pending requests that those requests are invalid now. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-21 16:31:59 -08:00
Derek Collison	7f572983ac	The 2.7 update broke the one-shot pull consumer fetch behavior due to change and a fix to a bug that allowed it to work before. This change tries to lock down all expected behaviors, and now does out of order timeouts for requests. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-20 18:06:29 -08:00
Ivan Kozlovic	84f6cbb760	Pooling pubMsg and jsPubMsg objects This should help with GC pressure, however, it may have an effect on performance (based on some benchmark). Calling sync.Pool.Get/Put too often has a performance impact... Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-01-13 13:14:25 -07:00
Ivan Kozlovic	d74dba2df9	Replaced RAFT's append entry response channel Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-01-13 13:06:48 -07:00
Ivan Kozlovic	23ebf9d2f8	Adapted jsOutQ Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-01-13 13:05:27 -07:00


300 Commits