During an elected stepdown and transfer, allow the new leader to take over before we step down.
We could receive a leader change, so make sure to also check migration state.
Signed-off-by: Derek Collison <derek@nats.io>
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The system will allow an update to move a stream, and subsequently all attached consumers, to another cluster, either directly or via tag placement.
The meta layer will scale the underlying peerset appropriately to straddle the two clusters for both the stream and consumers, taking into account the consumer type.
Control will then pass to the current leaders of the assets who will monitor the catchup status of the new peers.
(Note we can optimize this later to only traverse once across a GW for any given asset, but for now this is simpler)
Once the original leaders have determined the assets are synched, they will pass leadership to a member of the new peerset.
Once the new leader has been elected, it will forward a request for the meta layer to shrink the peerset by removing the old peers.
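From the operator's point of view this is driven by an ordinary stream config update; a minimal sketch using the nats.go JetStream API (the stream name, subjects, and cluster name below are made up for illustration):

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Changing the placement (directly by cluster name, or via tags) kicks off
	// the flow described above: the meta layer expands the peer set across both
	// clusters, the current leaders wait for the new peers to catch up, then
	// leadership is transferred and the old peers are removed.
	_, err = js.UpdateStream(&nats.StreamConfig{
		Name:     "ORDERS",
		Subjects: []string{"orders.>"},
		Replicas: 3,
		Placement: &nats.Placement{
			Cluster: "east", // target cluster; tags could be used instead
		},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```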
Signed-off-by: Derek Collison <derek@nats.io>
Some warnings, especially ones for JS limits that were printed on a
per-message basis, are now limited to ~1 per second if the content of
the warning is already found in a map.
This also applies to "client" warnings, but the client portion of the
warning is not taken into account, which helps reduce logging of
similar content coming from different clients.
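A minimal sketch of the idea (not the actual server code; names are made up): warnings are keyed by their formatted content, and an identical message is emitted at most about once per second:

```go
package main

import (
	"fmt"
	"log"
	"sync"
	"time"
)

// rateLimitedLogger emits a given warning at most about once per second,
// keyed by the formatted content of the message.
type rateLimitedLogger struct {
	mu   sync.Mutex
	seen map[string]time.Time
}

func (l *rateLimitedLogger) Warnf(format string, args ...interface{}) {
	msg := fmt.Sprintf(format, args...)
	now := time.Now()

	l.mu.Lock()
	if last, ok := l.seen[msg]; ok && now.Sub(last) < time.Second {
		l.mu.Unlock()
		return // identical warning already logged within the last second
	}
	l.seen[msg] = now
	l.mu.Unlock()

	log.Printf("[WRN] %s", msg)
}

func main() {
	l := &rateLimitedLogger{seen: make(map[string]time.Time)}
	for i := 0; i < 1000; i++ {
		l.Warnf("JetStream resource limits exceeded for account %q", "A")
	}
	// Only the first warning is printed; the duplicates are dropped.
}
```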
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Adds a unit test to cover this scenario.
Improves reporting of the correct error.
Only shows info for non-existing tiers where streams exist.
Signed-off-by: Matthias Hanel <mh@synadia.com>
* Adding server limits (max ack pending / dedupe window) to the JS config
Also shifting the consumer config check to jsConsumerCreate, since in
clustered mode this was enforced in the wrong place
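For illustration, a sketch of the client-visible effect (using the nats.go API; the names and limit value are made up, and the exact rejection behavior is an assumption based on this change): a consumer that asks for more than the server's max ack pending limit would now be refused when it is created:

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// With a server-side max ack pending limit configured, asking for more
	// than the limit would presumably be rejected at creation time (the
	// check now living in jsConsumerCreate in both modes).
	_, err = js.AddConsumer("ORDERS", &nats.ConsumerConfig{
		Durable:       "workers",
		AckPolicy:     nats.AckExplicitPolicy,
		MaxAckPending: 100_000, // above the hypothetical server limit
	})
	if err != nil {
		log.Printf("consumer rejected: %v", err)
	}
}
```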
Signed-off-by: Matthias Hanel <mh@synadia.com>
Also fixed a bug where we were incorrectly not spinning up the monitoring loop for a stream when going from 3->1->3.
Signed-off-by: Derek Collison <derek@nats.io>
This could show up when a consumer leader changes for a consumer
that had redelivered messages and, for instance, messages were inbound
on the stream.
Resolves #2912
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Previously we relied more heavily on Go's garbage collector: when we loaded a block for an underlying stream we would pass references upward to avoid copies.
Now we always copy when passing data back to the upper layers, which allows us to not only expire our cache blocks but also pool and reuse them.
The upper layers were also changed so that their pooling layer can optionally interoperate with the storage layer.
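A simplified sketch of the pattern with made-up names (not the actual filestore code): block buffers come from a pool, and message data is copied out of a block before the block is recycled:

```go
package main

import "sync"

// Block buffers are pooled instead of being left for the garbage collector.
var blkPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 64*1024) },
}

// loadMsg copies a message out of a cached block; the caller owns the copy,
// so no references into the block's memory escape to the upper layers.
func loadMsg(blk []byte, off, ln int) []byte {
	msg := make([]byte, ln)
	copy(msg, blk[off:off+ln])
	return msg
}

// releaseBlock resets the buffer and returns it to the pool for reuse.
func releaseBlock(blk []byte) {
	blkPool.Put(blk[:0])
}

func main() {
	blk := blkPool.Get().([]byte)
	blk = append(blk, "hello world"...)
	msg := loadMsg(blk, 0, 5) // "hello", independent of blk's memory
	releaseBlock(blk)
	_ = msg
}
```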
Also fixed some test flappers and a bug where the de-dupe state might not be re-formed correctly.
Signed-off-by: Derek Collison <derek@nats.io>
This was introduced by the change for ipQueues in #2931.
The (*ipQueue).unregister() was written with protection against the
ipQueue being nil; however, mset.outq is actually not a bare
ipQueue but a jsOutQ that embeds a pointer to an ipQueue. So we
need to implement unregister() for jsOutQ as well.
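A minimal illustration of the Go pitfall involved, using simplified stand-in types:

```go
package main

type ipQueue struct{ name string }

// unregister is safe to call on a nil *ipQueue.
func (q *ipQueue) unregister() {
	if q == nil {
		return
	}
	// ... remove the queue from the registry ...
}

type jsOutQ struct {
	*ipQueue // embedded pointer
}

func main() {
	var outq *jsOutQ
	// The promoted method has to read outq.ipQueue first, so this panics
	// with a nil pointer dereference before unregister()'s own nil check
	// ever runs. Giving jsOutQ its own nil-aware method fixes it:
	//
	//   func (q *jsOutQ) unregister() {
	//       if q == nil { return }
	//       q.ipQueue.unregister()
	//   }
	outq.unregister()
}
```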
Added a test that reproduces the issue; the bug was originally found
via a flapping test (TestJetStreamLongStreamNamesAndPubAck) that failed
due to a file name being too long.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removal of an external stream source was not working properly,
allowing messages to keep flowing after the removal, until the
server hosting the stream from which the source was removed was
restarted.
Resolves #2920
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removed the warnings; instead, they are registered/unregistered in a
sync.Map and can be inspected via an undocumented monitor page.
Added the notion of "in progress", which is the number of messages
that have been pop()'ed. When recycle() is invoked this count goes
back down.
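A simplified sketch of the in-progress accounting with made-up names (not the actual ipQueue code):

```go
package main

import "sync"

// Entries handed out by pop() count as "in progress" until the caller
// recycles the batch; a monitor page can then report both numbers.
type queue struct {
	mu         sync.Mutex
	elts       []interface{}
	inProgress int
}

func (q *queue) push(e interface{}) {
	q.mu.Lock()
	q.elts = append(q.elts, e)
	q.mu.Unlock()
}

// pop hands the whole pending batch to the caller and counts it as in progress.
func (q *queue) pop() []interface{} {
	q.mu.Lock()
	defer q.mu.Unlock()
	elts := q.elts
	q.elts = nil
	q.inProgress += len(elts)
	return elts
}

// recycle is called once the batch has been processed; the in-progress
// count goes back down.
func (q *queue) recycle(elts []interface{}) {
	q.mu.Lock()
	q.inProgress -= len(elts)
	q.mu.Unlock()
}

func main() {
	q := &queue{}
	q.push("msg")
	batch := q.pop() // inProgress == 1
	q.recycle(batch) // inProgress == 0
}
```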
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.
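For illustration, the difference between the two packages:

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

func main() {
	// path.Join always uses forward slashes, regardless of the OS:
	fmt.Println(path.Join(`C:\nats`, "jetstream", "streams"))
	// -> C:\nats/jetstream/streams

	// filepath.Join uses the OS-specific separator, which is what file
	// system paths need on Windows:
	fmt.Println(filepath.Join(`C:\nats`, "jetstream", "streams"))
	// -> C:\nats\jetstream\streams on Windows
}
```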
Also fixed lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config file
directories to use the single quote `'` to surround the file path,
again so they work on Windows.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The "deleted" advisory was missing because the stream's send loop
was closed before the advisory was pushed to the queue to be sent.
Added tests, for both single and clustered mode, covering all stream
advisories.
Resolves #2886
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This change provides a bit better logging on startup, making it easier to map a RAFT log directory, etc., to its stream/consumer.
Signed-off-by: Derek Collison <derek@nats.io>
This should help with GC pressure; however, it may have an effect
on performance (based on some benchmarks), since calling
sync.Pool.Get/Put too often carries its own cost.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Under load, a message could be committed to the underlying store while a consumer was being created, and then num pending would be incremented again when the stream signals the consumers.
The fix remembers the last sequence of the state when we calculate sgap, and checks against it before adding in the stream code.
Signed-off-by: Derek Collison <derek@nats.io>
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.
We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves.
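For illustration, under such a requirement every stream config must carry a MaxBytes value; a sketch using the nats.go client (the names and size are made up):

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// With the account requirement enabled, a stream config without MaxBytes
	// would be rejected; with it set, placement has a concrete size to work
	// with when balancing and overflowing across clusters.
	_, err = js.AddStream(&nats.StreamConfig{
		Name:     "EVENTS",
		Subjects: []string{"events.>"},
		MaxBytes: 8 * 1024 * 1024 * 1024, // 8GiB cap, required by the account
	})
	if err != nil {
		log.Fatal(err)
	}
}
```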
Signed-off-by: Derek Collison <derek@nats.io>
It is actually faster to not track this at all and to generate it on the fly, and it saves lots of memory too.
When we update the stream state to include runs, etc., we will update this as well.
Signed-off-by: Derek Collison <derek@nats.io>
This will patch them on the fly during recovery: specifically, subjects with leading or trailing spaces, and mirror streams that have any subjects at all.
Signed-off-by: Derek Collison <derek@nats.io>
When a consumer is configured with the "meta-only" option and the
stream is backed by a memory store, memory corruption could occur,
causing the application to receive corrupted headers.
Also replaced most usage of `append(a[:0:0], a...)` for making
copies. That pattern was based on this wiki:
https://github.com/go101/go101/wiki/How-to-efficiently-clone-a-slice%3F
But since Go 1.15, it is actually faster to call make+copy instead.
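For reference, the two cloning patterns side by side:

```go
package main

import "fmt"

func main() {
	src := []byte("payload")

	// The full-slice-expression clone that was being replaced:
	a := append(src[:0:0], src...)

	// make+copy, which since Go 1.15 is generally the faster way to clone:
	b := make([]byte, len(src))
	copy(b, src)

	fmt.Println(string(a), string(b))
}
```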
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
1. When a snapshot did not yield actionable data, we were not setting the new last sequence when we had to readjust based on the snapshot. This could lead to followers spinning on stream resets.
2. When a stream has lots of failures by design, like the KV abstraction, clearing the clfs state would make us endlessly spin trying to reset the stream.
Signed-off-by: Derek Collison <derek@nats.io>
When encountering benign sequence mismatch errors, we were returning an error and not processing the rest of the entries.
This would lead to more severe sequence mismatches later on that would cause stream resets.
Also added code to deal with server restarts and the clfs fixup states, which should have been reset properly.
Signed-off-by: Derek Collison <derek@nats.io>
There was a bug that would erase the sync subject used for upper-level catchup of streams.
Repair at the Raft layer was ok, but if that log was compacted the repair gets kicked up to the upper layers, which would fail.
Users would see "Catchup stalled" messages repeatedly, and consumers whose leaders were attached to that replica would also stop working.
Changes were also put in to repair the corrupt state after the fact, regardless of the presence of the fix.
Signed-off-by: Derek Collison <derek@nats.io>
Calls to mset.unsubscribe() need to use the version that takes the
lock when invoked from the subscription callback or from the
go routine after the 10 seconds have elapsed.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>