Streams with many interior deletes was causing issues due to the fact that the interior deletes were represented as a sorted []uint64.
This approach introduces 3 sub types of delete blocks, avl bitmask tree, a run length encoding, and the legacy format above.
We also take into account large interior deletes such that on receiving a snapshot we can skip things we already know about.
Signed-off-by: Derek Collison <derek@nats.io>
Optimized and fixed a bug in filestore filteredPending().
Optimized memstore FilteredState().
Added comprehensive tests for NumPending() and FilteredState().
Signed-off-by: Derek Collison <derek@nats.io>
The bug was when a timestamp for the pending state was exactly -1 which could happen based on timing of the redlivered pending items which would set pending.Timestamp into the future potentially and the timing on the encodeConsumerState call.
Minor fixes to raft.
Signed-off-by: Derek Collison <derek@nats.io>
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
step down or server restart.
Signed-off-by: Derek Collison <derek@nats.io>
Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoimd copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.
The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.
Also fixed some flappers and a bug where de-dupe might not be reformed correctly.
Signed-off-by: Derek Collison <derek@nats.io>
When a stream has multiple subjects and a consumer filters the stream to a small and spread out list of messages the logic would do a linear scan looking for the next message for the filtered consumer.
This CL allows the store layer to utilize the per subject info to improve the times.
Signed-off-by: Derek Collison <derek@nats.io>
There was a bug that would erase the sync subject for upper level catchup for streams.
Raft layer repair was ok but if that was compacted it gets kicked up to the upper layers which would fail.
Users would see "Catchup stalled" messages repeatedly and consumers that had their leaders attached to that replica would also stop working.
Changes were put in to repair the corrupt state after the fact as well, regardless of presence of fix.
Signed-off-by: Derek Collison <derek@nats.io>
Replicated durable consumers that were backed by a memory store were bypassing snapshotting which also did compaction of the raft WAL.
This change adapts for memory store backed consumers by compacting the raft WAL directly on snapshot logic.
Signed-off-by: Derek Collison <derek@nats.io>
This change was made in a previous PR wit this commit:
9405b77e46
After some discussions, we agreed that the original approach
is best, so using a dedicated object SequenceInfo for ConsumerInfo.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This change introduces utilization, better interior block deletes, and individual block compaction when we are below 50% utilization of the block.
Signed-off-by: Derek Collison <derek@nats.io>
1. We were holding open FDs longer than we should for consumers causing issues with open FD limits. We now do not hold them open and cap updates a bit better.
2. When doing a stream delete, consumer delete was repeating alot of work that was not necessary, causing longer delays. This has been optimized a bit, still more improvements to be made.
3. We cover all JS under a single export, but that was also trapping GetNext for pull based consumers, and since this was a no-op (is handled at user account level) we were creating alot of garbage service import responses and reverse map entries that had to be garbage collected. We have a fix in to avoind this but still looking for a better one.
4. Still had some lingering references to all exports vs single JS export.
Signed-off-by: Derek Collison <derek@nats.io>
Some of these are quite generic errors that can happen a lot
in normal circumstances so no need to be too noisy about them
Fixes one missed old style Api Error
Signed-off-by: R.I.Pienaar <rip@devco.net>
Allow wider scoped filtered subjects.
We introduce a per subject information tracking to filestore to optimize for large mux'd streams and more efficient filtered consumers.
Signed-off-by: Derek Collison <derek@nats.io>
We were not properly enforcing server limits. This commit will allow a server to enforce limits but still remain functional even at the JetStream level.
Also fixed a bug for RAFT replay that could cause instability.
Signed-off-by: Derek Collison <derek@nats.io>
This made us add forwarding proposals functionality in the raft layer.
More general cleanup and bug fixes as well.
Signed-off-by: Derek Collison <derek@nats.io>
Underestimated the effort to get stream restore working properly in cluster mode.
Some good bug fixes and stability improvments.
Signed-off-by: Derek Collison <derek@nats.io>
We would release locks and call into upper layers when removing a message. The upper layers may call back into the lower layers to get more information, such as the subject.
This fix has the storage updates optionally supply the subject for filtered consumers and fixes the bug of double deletes.
Signed-off-by: Derek Collison <derek@nats.io>
In preparation for clustering we need to have the consumer filestore update state with deltas vs original design.
Signed-off-by: Derek Collison <derek@nats.io>
The original design had a shared filestore write buffer and individual message blocks had a read cache.
This presented some performance and stability issues when lots of reads and deletes were happening to a
message block that was also being written to actively.
This change eliminates the shared write buffer and uses the message block's cache as a write through as
well as read cache and handles partials correctly.
Signed-off-by: Derek Collison <derek@nats.io>
This will track the stream pending state for each consumer.
This code does account for filtered consumers.
Signed-off-by: Derek Collison <derek@nats.io>