Use better indexing for lookups: we used to do a simple linear scan backwards, now we track the first and last block.
We also expire the fss cache as needed to reduce memory usage.
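A minimal sketch of the idea, with hypothetical types and names (not the actual filestore code): keep the first and last block per subject so a lookup only scans that bounded range instead of walking all blocks backwards.
```go
package main

import "fmt"

// psi is a hypothetical per-subject entry tracking the bounding blocks.
type psi struct {
	total       uint64 // messages for this subject
	first, last int    // first and last block index containing it
}

type store struct {
	psim map[string]*psi // per-subject index
}

// track records that a message for subj landed in block i.
func (s *store) track(subj string, i int) {
	e := s.psim[subj]
	if e == nil {
		e = &psi{first: i, last: i}
		s.psim[subj] = e
	}
	e.total++
	if i < e.first {
		e.first = i
	}
	if i > e.last {
		e.last = i
	}
}

// blockRange bounds the search for subj instead of a full backwards scan.
func (s *store) blockRange(subj string) (first, last int, ok bool) {
	if e := s.psim[subj]; e != nil {
		return e.first, e.last, true
	}
	return 0, 0, false
}

func main() {
	s := &store{psim: map[string]*psi{}}
	s.track("orders.new", 2)
	s.track("orders.new", 7)
	fmt.Println(s.blockRange("orders.new")) // 2 7 true
}
```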
Signed-off-by: Derek Collison <derek@nats.io>
A message block checks the filestore's cfg.Subjects to see
if it can "intern" the subject or not. The problem is that this
is done under the message block's lock, but not the filestore's.
However, during a stream configuration update, the filestore's
cfg field is switched to a new one, causing the data race.
By making sure we do the switch while holding all message block
locks, we remove the data race (which could be reproduced by
running the test TestJetStreamClusterMoveCancel with -count=10).
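A sketch of the fix with simplified, hypothetical types (not the real server structs): the config swap happens only while every message block's lock is held, so a block reading cfg under its own lock can never observe the switch mid-flight.
```go
package main

import "sync"

// Simplified stand-ins for illustration only.
type StreamConfig struct{ Subjects []string }

type msgBlock struct{ mu sync.Mutex }

type fileStore struct {
	mu   sync.Mutex
	cfg  StreamConfig
	blks []*msgBlock
}

// updateConfig swaps fs.cfg while holding all message block locks.
func (fs *fileStore) updateConfig(newCfg StreamConfig) {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	// Acquire every block lock so no block reads cfg.Subjects mid-swap.
	for _, mb := range fs.blks {
		mb.mu.Lock()
	}
	fs.cfg = newCfg
	for _, mb := range fs.blks {
		mb.mu.Unlock()
	}
}

func main() {
	fs := &fileStore{blks: []*msgBlock{{}, {}}}
	fs.updateConfig(StreamConfig{Subjects: []string{"foo.*"}})
}
```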
We investigated the use of a string interning library, but it
showed a slight performance degradation that this approach does
not suffer from.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
stepdown or a server restart.
Signed-off-by: Derek Collison <derek@nats.io>
We were getting a data race checking the js.clustered field in
updateUsage() following the fix for a lock inversion in PR #3092.
```
=== RUN TestJetStreamClusterKVMultipleConcurrentCreate
==================
WARNING: DATA RACE
Read at 0x00c0009db5d8 by goroutine 195:
github.com/nats-io/nats-server/v2/server.(*jsAccount).updateUsage()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:1681 +0x8f
github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2927 +0x1d9
github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates-fm()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2905 +0x7d
github.com/nats-io/nats-server/v2/server.(*fileStore).removeMsg()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2158 +0x14f7
github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x18f
github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2770 +0x39
Previous write at 0x00c0009db5d8 by goroutine 128:
github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:604 +0xfae
github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
github.com/nats-io/nats-server/v2/server.(*Server).Start()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4269()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
Goroutine 195 (running) created at:
time.goFunc()
/home/travis/.gimme/versions/go1.17.9.linux.amd64/src/time/sleep.go:180 +0x49
Goroutine 128 (finished) created at:
github.com/nats-io/nats-server/v2/server.RunServer()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x278
github.com/nats-io/nats-server/v2/server.RunServerWithConfig()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:112 +0x44
github.com/nats-io/nats-server/v2/server.(*cluster).restartServer()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_helpers_test.go:1004 +0x1d5
github.com/nats-io/nats-server/v2/server.TestJetStreamClusterKVMultipleConcurrentCreate()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster_test.go:8463 +0x64b
testing.tRunner()
/home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1259 +0x22f
testing.(*T).Run·dwrap·21()
/home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1306 +0x47
==================
```
Running that test after adding some delays in several places also showed another race:
```
==================
WARNING: DATA RACE
Read at 0x00c00016adb8 by goroutine 160:
github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x106
github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2771 +0x39
Previous write at 0x00c00016adb8 by goroutine 32:
github.com/nats-io/nats-server/v2/server.(*fileStore).UpdateConfig()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:360 +0x1c8
github.com/nats-io/nats-server/v2/server.(*stream).update()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/stream.go:1360 +0x852
github.com/nats-io/nats-server/v2/server.(*jetStream).processClusterCreateStream()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2704 +0x4a4
github.com/nats-io/nats-server/v2/server.(*jetStream).processStreamAssignment()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2452 +0xad9
github.com/nats-io/nats-server/v2/server.(*jetStream).applyMetaEntries()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:1407 +0x7e4
github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:887 +0xc75
github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster-fm()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:813 +0x39
Goroutine 160 (running) created at:
time.goFunc()
/usr/local/go/src/time/sleep.go:180 +0x49
Goroutine 32 (running) created at:
github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:3013 +0x86
github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:612 +0x1092
github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
github.com/nats-io/nats-server/v2/server.(*Server).Start()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4275()
/Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
==================
```
Both are now addressed, either with proper locking or with the use of an atomic in the place
where we cannot get the lock (without re-introducing the lock inversion issue).
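The atomic part of the fix, sketched with hypothetical names: the flag is written once at setup and read lock-free on the hot path, which avoids taking the lock that caused the inversion.
```go
package main

import (
	"fmt"
	"sync/atomic"
)

// jsAcct is a stand-in for the account state; only the pattern matters.
type jsAcct struct {
	clustered int32 // read with atomics on the usage hot path
}

// setClustered is called once during setup (the "previous write" side).
func (a *jsAcct) setClustered(on bool) {
	var v int32
	if on {
		v = 1
	}
	atomic.StoreInt32(&a.clustered, v)
}

// isClustered is safe to call from updateUsage without holding the lock.
func (a *jsAcct) isClustered() bool {
	return atomic.LoadInt32(&a.clustered) == 1
}

func main() {
	a := &jsAcct{}
	a.setClustered(true)
	fmt.Println(a.isClustered()) // true
}
```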
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a downstream stream uses retention modes that delete messages, fall back to a time-based start time for the new source consumers.
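A sketch of the fallback logic, with hypothetical types (the real consumer setup differs): when retention can delete messages, the remembered sequence may no longer be valid, so the recreated source consumer starts by time instead.
```go
package main

import (
	"fmt"
	"time"
)

// RetentionPolicy stand-in; interest and work-queue retention delete messages.
type RetentionPolicy int

const (
	LimitsPolicy RetentionPolicy = iota
	InterestPolicy
	WorkQueuePolicy
)

// sourceStart picks how a recreated source consumer should start.
func sourceStart(r RetentionPolicy, nextSeq uint64, lastTS time.Time) string {
	if r == InterestPolicy || r == WorkQueuePolicy {
		// Retention deletes messages, so the tracked sequence may be
		// gone: fall back to a time-based start.
		return fmt.Sprintf("start by time %v", lastTS)
	}
	return fmt.Sprintf("start at sequence %d", nextSeq)
}

func main() {
	fmt.Println(sourceStart(InterestPolicy, 42, time.Now()))
	fmt.Println(sourceStart(LimitsPolicy, 42, time.Now()))
}
```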
Signed-off-by: Derek Collison <derek@nats.io>
There was a case where we could check the max-per-subject
limit twice per message. That would apply to streams that have
max-per-subject and also discard_new, which is what KV configures.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When deciding to compact a file, we need to subtract the empty
records from the raw bytes; otherwise, for small messages, we would
end up calling compact() too many times.
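A sketch of the adjusted decision, with assumed field names and an arbitrary threshold: only the bytes left after subtracting empty records count toward the comparison.
```go
package main

import "fmt"

// shouldCompact decides whether a block file is worth compacting.
// rbytes is the raw size of the block, dbytes the bytes consumed by
// empty (deleted) records. Names and threshold are illustrative.
func shouldCompact(rbytes, dbytes uint64) bool {
	live := rbytes - dbytes
	// Comparing against rbytes directly would trigger for small messages
	// far too often; require real reclaimable space relative to live data.
	return dbytes > 0 && dbytes >= live/2
}

func main() {
	fmt.Println(shouldCompact(1000, 100)) // false: little to reclaim
	fmt.Println(shouldCompact(1000, 600)) // true: mostly empty records
}
```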
When removing a message from the stream, in FIFO cases we would
write the index at most every 2 seconds when doing it in place,
but when dealing with out-of-order deletes, we would do it for
every single delete, which can be costly. We now write
only every 500ms for non-FIFO cases.
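A sketch of the coalescing, with hypothetical names and the intervals from the text: out-of-order deletes now flush at most every 500ms instead of on every delete.
```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	fifoFlushInterval       = 2 * time.Second        // in-place (FIFO) removes
	outOfOrderFlushInterval = 500 * time.Millisecond // out-of-order removes
)

type indexWriter struct {
	mu        sync.Mutex
	lastWrite time.Time
}

// maybeWriteIndex coalesces index writes instead of writing per delete.
func (w *indexWriter) maybeWriteIndex(fifo bool, write func()) {
	w.mu.Lock()
	defer w.mu.Unlock()
	interval := outOfOrderFlushInterval
	if fifo {
		interval = fifoFlushInterval
	}
	if time.Since(w.lastWrite) >= interval {
		write()
		w.lastWrite = time.Now()
	}
}

func main() {
	w := &indexWriter{}
	for i := 0; i < 3; i++ {
		w.maybeWriteIndex(false, func() { fmt.Println("index written") })
	}
	// Only the first call writes; the next two are coalesced.
}
```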
Also fixed some unrelated code:
- The decision to install a snapshot was based on an incorrect
logical expression.
- In checkPending(), protect against the timer being nil, which
could happen when the consumer is stopped or leadership changes.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- A deadlock with the monitoring of Jsz.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is a regression introduced in v2.7.3 (with PR #2848).
When the store has a list of subjects, finding the next message
to deliver would go through the subjects map and have to match
each entry to find out if it is a subset (when the filter had a
wildcard). In situations where there were lots of subjects (for
instance, 1 message per subject) but the consumer did not filter
on anything specific, this processing became slow.
We now check that if the stream has a single subject (even with a
wildcard) and the consumer filters on that exact subject, then
we can do a linear scan. We also do a linear scan if the number
of messages in the block is at most half the number of subjects
in the subjects map.
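The heuristic, sketched with assumed parameter names (not the server's):
```go
package main

import "fmt"

// useLinearScan decides whether scanning the block beats walking the
// subjects map.
func useLinearScan(streamSubjects []string, filter string, msgsInBlock, numSubjects int) bool {
	// Single stream subject (possibly a wildcard) matched exactly by the
	// consumer's filter: every message matches, so just scan.
	if len(streamSubjects) == 1 && streamSubjects[0] == filter {
		return true
	}
	// Few messages relative to tracked subjects: scanning them is cheaper
	// than iterating the subjects map with subset matching.
	return msgsInBlock <= numSubjects/2
}

func main() {
	fmt.Println(useLinearScan([]string{"kv.>"}, "kv.>", 1000, 1000)) // true
	fmt.Println(useLinearScan([]string{"a", "b"}, "a", 10, 100))     // true
	fmt.Println(useLinearScan([]string{"a", "b"}, "a", 100, 100))    // false
}
```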
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also fixed a bug that could cause memory-based replicated consumers to no longer work after snapshots and server restarts.
The snapshot logic would allow non-state-changing updates to continuously grow the raft logs. We were also too conservative about when we snapshotted and why.
Also added the ability for FileStore.Compact() to reclaim space in the block file from the head of the last changed block.
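A sketch of the snapshot guard, with a hypothetical raft interface: comparing against the last installed snapshot keeps no-op updates from growing the log.
```go
package main

import (
	"bytes"
	"fmt"
)

// raftNode is a minimal stand-in for the raft node's snapshot API.
type raftNode interface {
	InstallSnapshot(snap []byte) error
}

type snapshotter struct {
	node raftNode
	last []byte
}

// maybeSnapshot installs a snapshot only when state actually changed.
func (s *snapshotter) maybeSnapshot(state []byte, force bool) error {
	if !force && bytes.Equal(state, s.last) {
		return nil // unchanged: skip, so the raft log can't grow unbounded
	}
	if err := s.node.InstallSnapshot(state); err != nil {
		return err
	}
	s.last = append([]byte(nil), state...)
	return nil
}

type fakeNode struct{}

func (fakeNode) InstallSnapshot([]byte) error { fmt.Println("snapshot installed"); return nil }

func main() {
	s := &snapshotter{node: fakeNode{}}
	s.maybeSnapshot([]byte("state-1"), false) // installs
	s.maybeSnapshot([]byte("state-1"), false) // skipped
}
```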
Signed-off-by: Derek Collison <derek@nats.io>
This may prevent memory copies when they are not necessary. Also fixed a bug
there that would check twice whether there was only 1 subject and that
subject did not match (say the configured subject is foo.* and the key is
foo.bar).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Under contention on the head write block, the system could perform worse memory-wise than the plain Go runtime.
We also held some references to message subjects that bloated memory.
Signed-off-by: Derek Collison <derek@nats.io>
We had a report of a panic on server restart with 2.8.0-beta.1. The panic came from allocating a load block sized from the number of messages we thought the block had based on the index.
Before, SkipMsg would decrement the count, and when we added the record via writeMsgRecord we would add it back in. However, we released the lock in between, meaning other things could run.
If the count was decremented in between, say to 0 (we did protect against underflow there), then a remove and a subsequent writeIndexInfo would stamp the index and underflow.
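A sketch of the guard, with hypothetical fields: every adjustment clamps at zero instead of wrapping, so an interleaved remove between the SkipMsg decrement and the writeMsgRecord re-add cannot underflow the count.
```go
package main

import (
	"fmt"
	"sync"
)

type msgBlock struct {
	mu   sync.Mutex
	msgs uint64 // message count persisted via the index
}

// subMsgs decrements the count, clamping at zero instead of wrapping,
// which would otherwise stamp a huge count into the index.
func (mb *msgBlock) subMsgs(n uint64) {
	mb.mu.Lock()
	defer mb.mu.Unlock()
	if n > mb.msgs {
		mb.msgs = 0
	} else {
		mb.msgs -= n
	}
}

func main() {
	mb := &msgBlock{}
	mb.subMsgs(1) // would wrap to math.MaxUint64 without the clamp
	fmt.Println(mb.msgs)
}
```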
Signed-off-by: Derek Collison <derek@nats.io>
Previously we relied more heavily on Go's garbage collector, since when we loaded a block for an underlying stream we would pass references upward to avoid copies.
Now we always copy when passing back to the upper layers, which allows us to not only expire our cache blocks but also pool and reuse them.
The upper layers were also changed so that the pooling at that level can optionally interoperate with the storage layer.
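The copy-on-read change in miniature, with simplified types: callers get their own copy, so the cache buffer can be expired or returned to a pool without invalidating anything upstream.
```go
package main

import "fmt"

type cache struct {
	buf []byte // shared cache buffer, eligible for expiry and pooling
}

// loadMsg returns a copy of the record instead of a sub-slice of buf.
// Returning buf[off:off+ln] directly would pin the whole buffer and
// make it unsafe to reuse.
func (c *cache) loadMsg(off, ln int) []byte {
	out := make([]byte, ln)
	copy(out, c.buf[off:off+ln])
	return out
}

func main() {
	c := &cache{buf: []byte("hello world")}
	msg := c.loadMsg(0, 5)
	c.buf = nil // cache expired; msg remains valid
	fmt.Println(string(msg))
}
```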
Also fixed some test flappers and a bug where the de-dupe state might not be rebuilt correctly.
Signed-off-by: Derek Collison <derek@nats.io>
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.
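The reason for the change in one small example: path.Join always uses '/', while filepath.Join uses the OS separator, which matters for real files on Windows.
```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

func main() {
	// path is for slash-separated names (URLs, subjects), never files.
	fmt.Println(path.Join("store", "msgs", "1.blk")) // store/msgs/1.blk
	// filepath respects the OS separator: store\msgs\1.blk on Windows.
	fmt.Println(filepath.Join("store", "msgs", "1.blk"))
}
```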
Also fixed lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config file
directories to use single quotes (`'`) to surround the file path,
again to work on Windows.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>