82 Commits

Author SHA1 Message Date
Derek Collison
54c5414c3d Detect malformed stream state snapshots and return appropriate error
Signed-off-by: Derek Collison <derek@nats.io>
2023-07-30 11:06:06 -07:00
Derek Collison
4d7cd26956 Add in support for segmented binary stream snapshots.
Streams with many interior deletes were causing issues because the interior deletes were represented as a sorted []uint64.
This approach introduces three subtypes of delete blocks: an AVL bitmask tree, a run-length encoding, and the legacy format above.
We also take into account large interior deletes such that on receiving a snapshot we can skip things we already know about.

Signed-off-by: Derek Collison <derek@nats.io>
2023-07-03 08:41:33 -07:00
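
To illustrate the run-length idea mentioned in the commit above, here is a minimal sketch. It is not the server's actual encoding: the `run` type and `encodeRuns` function are hypothetical names, and the sketch assumes the input is a sorted, duplicate-free list of deleted sequences.

```go
package main

import "fmt"

// run is a hypothetical run-length record: Count consecutive
// deleted sequences starting at First.
type run struct {
	First uint64
	Count uint64
}

// encodeRuns collapses a sorted, duplicate-free slice of deleted
// sequences into runs of consecutive values.
func encodeRuns(deleted []uint64) []run {
	var runs []run
	for _, seq := range deleted {
		// Extend the last run when seq directly follows it.
		if n := len(runs); n > 0 && runs[n-1].First+runs[n-1].Count == seq {
			runs[n-1].Count++
			continue
		}
		// Otherwise start a new run of length one.
		runs = append(runs, run{First: seq, Count: 1})
	}
	return runs
}

func main() {
	// Six deleted sequences collapse into three runs.
	fmt.Println(encodeRuns([]uint64{3, 4, 5, 9, 10, 42}))
}
```

A long stretch of consecutive deletes costs one record here instead of one uint64 per deleted sequence, which is the case the sorted-slice representation handles poorly.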
Maurice van Veen
132567de39 Fix PurgeEx replay with sequence & keep succeeds 2023-06-04 11:56:28 +02:00
Derek Collison
b597485102 Improve performance on storing msgs when multiple subjects exist with multiple messages and we have store limits that are being hit.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-13 15:27:34 -07:00
Derek Collison
daacbf5580 Added optimized store NumPending() call.
Optimized and fixed a bug in filestore filteredPending().
Optimized memstore FilteredState().

Added comprehensive tests for NumPending() and FilteredState().

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-25 17:26:26 -08:00
Derek Collison
24c2f3b452 Improved performance of subjects details for stream info.
This version avoids all disk IO in the filestore version.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-24 17:22:18 -08:00
Derek Collison
894115b82b Fix for server panic when consumer state was not decoded correctly.
The bug occurred when a timestamp for the pending state was exactly -1, which could happen depending on the timing of the redelivered pending items (which could set pending.Timestamp into the future) and the timing of the encodeConsumerState call.

Minor fixes to raft.

Signed-off-by: Derek Collison <derek@nats.io>
2022-12-06 14:16:20 -08:00
Derek Collison
36ef788112 When determining whether we need an ack, no need to copy since under consumer lock.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-14 11:47:31 -08:00
Derek Collison
cc197771ec Allow compile and staticcheck to pass.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-24 09:17:12 -07:00
Ivan Kozlovic
53e3c53d96 [FIXED] JetStream: consumer with deliver new may miss messages
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
step down or server restart.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-23 12:01:48 -06:00
Derek Collison
41cca8d6c4 Allow proper mix and match of consumer stores and stream stores.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-18 12:51:48 -07:00
Derek Collison
4291433a46 General improvements to accounting for the filestore. This is in response to tracking issue #3114.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-12 15:43:11 -07:00
Derek Collison
e7ff38a4ca Add consumerMemStore impl to allow proper replication of state.
Resolves #3006

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:01:13 -07:00
Matthias Hanel
0c5f3688a7 [ADDED] Tiered limits and fix limit issues on updates (#2945)
* Adding tiered limits and fix limit issues on updates

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-28 20:47:54 -04:00
Derek Collison
ef8f543ea5 Improve memory usage through JetStream storage layer.
Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoid copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.

The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.

Also fixed some flappers and a bug where de-dupe might not be reformed correctly.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-24 17:45:15 -06:00
Derek Collison
d50febeeff Improved sparse consumers replay time.
When a stream has multiple subjects and a consumer filters the stream to a small, spread-out list of messages, the logic would do a linear scan looking for the next message for the filtered consumer.
This CL allows the store layer to utilize the per subject info to improve the times.

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-07 17:26:32 -08:00
Derek Collison
5da0453964 Add in NumSubjects to StreamInfo
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-02 08:51:13 -08:00
Derek Collison
6a3cf0f71e Added in ability to get number of subjects from StreamInfo, and optionally details per subject on how many messages each subject has.
This can also be filtered, meaning you can filter out the subjects when asking for details.

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-02 08:51:13 -08:00
Derek Collison
275d42628b Fix for #2828. The original design of the consumer and the subsequent store did not allow updates.
Now that we do, we need to store the new config into our storage layer.

Signed-off-by: Derek Collison <derek@nats.io>
2022-01-30 09:45:05 -08:00
Derek Collison
d4b0b38a8f Fix for #2642
There was a bug that would erase the sync subject for upper level catchup for streams.
Raft layer repair was ok but if that was compacted it gets kicked up to the upper layers which would fail.
Users would see "Catchup stalled" messages repeatedly and consumers that had their leaders attached to that replica would also stop working.

Changes were put in to repair the corrupt state after the fact as well, regardless of presence of fix.

Signed-off-by: Derek Collison <derek@nats.io>
2021-10-26 20:09:00 -07:00
Derek Collison
de851e513f Fix for #2548
Replicated durable consumers that were backed by a memory store were bypassing snapshotting which also did compaction of the raft WAL.
This change adapts for memory store backed consumers by compacting the raft WAL directly on snapshot logic.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-21 08:02:11 -07:00
Ivan Kozlovic
1308c73273 [CHANGED] ConsumerInfo's SequencePair replaced with SequenceInfo
This change was made in a previous PR with this commit:
9405b77e46

After some discussions, we agreed that the original approach
is best, so using a dedicated object SequenceInfo for ConsumerInfo.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-08-23 12:28:23 -06:00
Derek Collison
d349edeeb6 When a JetStream stream was used as a KV, there could be times where we have lots of file storage unused.
This change introduces utilization, better interior block deletes, and individual block compaction when we are below 50% utilization of the block.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-19 18:24:41 -07:00
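
The 50% utilization threshold described in the commit above can be sketched as a simple check. This is an illustration under assumptions, not the filestore's code: `blockStats`, its fields, and `shouldCompact` are hypothetical names.

```go
package main

import "fmt"

// blockStats is a hypothetical summary of one message block.
type blockStats struct {
	LiveBytes  uint64 // bytes still occupied by undeleted messages
	TotalBytes uint64 // bytes the block occupies on disk
}

// shouldCompact reports whether live data makes up less than half
// of the block, i.e. utilization is below 50%.
func shouldCompact(b blockStats) bool {
	if b.TotalBytes == 0 || b.LiveBytes > b.TotalBytes {
		return false
	}
	// live < dead is equivalent to live/total < 0.5 and cannot overflow.
	return b.LiveBytes < b.TotalBytes-b.LiveBytes
}

func main() {
	fmt.Println(shouldCompact(blockStats{LiveBytes: 300, TotalBytes: 1000}))
	fmt.Println(shouldCompact(blockStats{LiveBytes: 500, TotalBytes: 1000}))
}
```

Comparing live bytes against dead bytes avoids floating-point division and keeps the boundary exact: a block at exactly 50% utilization is left alone.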
Derek Collison
9d7123213a Keep SequencePair vs SequenceInfo
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-14 12:01:29 -07:00
Derek Collison
9b0158daf9 Allow delivery policy of DeliverLastPerSubject, which is helpful for scoped watchers for K/V.
Signed-off-by: Derek Collison <derek@nats.io>
2021-07-28 12:49:02 -07:00
Derek Collison
99fed910f0 Improvements to large numbers of JetStream R1 consumers per stream.
1. We were holding open FDs longer than we should for consumers causing issues with open FD limits. We now do not hold them open and cap updates a bit better.

2. When doing a stream delete, consumer delete was repeating a lot of work that was not necessary, causing longer delays. This has been optimized a bit, still more improvements to be made.

3. We cover all JS under a single export, but that was also trapping GetNext for pull based consumers, and since this was a no-op (is handled at user account level) we were creating a lot of garbage service import responses and reverse map entries that had to be garbage collected. We have a fix in to avoid this but are still looking for a better one.

4. Still had some lingering references to all exports vs single JS export.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-29 05:45:55 -07:00
R.I.Pienaar
0d71d35e43 do not log at Error level for some store failures
Some of these are quite generic errors that can happen a lot
in normal circumstances so no need to be too noisy about them

Fixes one missed old style Api Error

Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-06-28 10:18:16 +02:00
Derek Collison
c0e47966ab Added in Stream get last message by subject.
This is to aid in K/V overlay for simple Get ops vs creating a full blown consumer.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-24 13:21:39 -07:00
Derek Collison
9398c3ca28 Allow for more advanced purge operations that filter by subject, specify the sequence or number of messages to keep.
Signed-off-by: Derek Collison <derek@nats.io>
2021-06-19 07:04:44 -07:00
Derek Collison
d0ac1a40ca Added in per subject limits for streams.
Signed-off-by: Derek Collison <derek@nats.io>
2021-06-15 06:36:34 -07:00
Derek Collison
08cdb2d2ea Make filtered consumers in large mixed streams more efficient.
Allow wider scoped filtered subjects.

We introduce a per subject information tracking to filestore to optimize for large mux'd streams and more efficient filtered consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-15 04:44:05 -07:00
Derek Collison
90989d57d6 Change to report total deleted by default for stream info.
Allow deleted details if requested.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-12 18:10:11 -07:00
Derek Collison
c0e8590c0f During startup each filtered consumer could do a linear scan of the stream
to determine number of messages pending. This improves that with a startup cache.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-07 09:15:01 -07:00
Derek Collison
e53caee5e8 Enforce server limits even when dynamic limits for accounts in play.
We were not properly enforcing server limits. This commit will allow a server to enforce limits but still remain functional even at the JetStream level.
Also fixed a bug for RAFT replay that could cause instability.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-25 16:06:27 -07:00
Waldemar Quevedo
86a64fbc46 Updates to JS consumer errors
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2021-03-09 09:46:28 -08:00
Derek Collison
d31fda5dac Added code to constrain size of WAL under most scenarios.
Signed-off-by: Derek Collison <derek@nats.io>
2021-03-06 08:38:56 -08:00
Derek Collison
e5c8774172 Handle out of space situations, general stability enhancements
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-25 17:54:29 -08:00
Derek Collison
c16f6e193d Move JetStream direct APIs to private.
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-07 15:19:22 -08:00
Derek Collison
b358773ddf Force filestore to flush in place by default.
Track lost data and truncate message blocks when detecting failures or write errors.

Signed-off-by: Derek Collison <derek@nats.io>
2021-02-06 20:04:47 -08:00
Derek Collison
9c858d197a Added ability to properly restore consumers from a snapshot.
This made us add forwarding proposals functionality in the raft layer.
More general cleanup and bug fixes as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-01-24 19:30:34 -08:00
Derek Collison
ff54c9dc9c Reworked snapshot and restore.
Underestimated the effort to get stream restore working properly in cluster mode.
Some good bug fixes and stability improvements.

Signed-off-by: Derek Collison <derek@nats.io>
2021-01-20 11:58:31 -08:00
Derek Collison
f0cdf89c61 JetStream Clustering WIP
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-14 01:14:52 -08:00
Derek Collison
7564768027 Added Compact to store interface for WAL functionality
Signed-off-by: Derek Collison <derek@nats.io>
2020-12-03 16:18:58 -08:00
Derek Collison
28cb4e8c34 Fix bug when removing the same message from a stream.
We would release locks and call into upper layers when removing a message. The upper layers may call back into the lower layers to get more information, such as the subject.
This fix has the storage updates optionally supply the subject for filtered consumers and fixes the bug of double deletes.

Signed-off-by: Derek Collison <derek@nats.io>
2020-11-13 17:05:24 -08:00
Derek Collison
164f9fdf2b Updates to consumer store to support delta updates.
In preparation for clustering we need to have the consumer filestore update state with deltas vs original design.

Signed-off-by: Derek Collison <derek@nats.io>
2020-11-10 19:16:55 -08:00
Derek Collison
fe2b354414 Stability and performance updates.
The original design had a shared filestore write buffer and individual message blocks had a read cache.
This presented some performance and stability issues when lots of reads and deletes were happening to a
message block that was also being written to actively.

This change eliminates the shared write buffer and uses the message block's cache as a write through as
well as read cache and handles partials correctly.

Signed-off-by: Derek Collison <derek@nats.io>
2020-10-28 07:58:47 -07:00
Derek Collison
df4ee081a5 Track number of stream pending for each consumer.
This will track the stream pending state for each consumer.
This code does account for filtered consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2020-10-27 19:30:42 -07:00
Derek Collison
ad247d1853 Add store.SkipMsg() and update no interest retention streams
Signed-off-by: Derek Collison <derek@nats.io>
2020-10-22 19:35:28 -07:00
Derek Collison
e584d4efee Merge pull request #1435 from nats-io/js-hdrs
First pass header support for JetStream
2020-05-31 06:01:01 -07:00
Derek Collison
eca04c6fce First pass header support for JetStream
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-30 10:04:23 -07:00