Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.
Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also added in more optimized version to select the first matching message in a message block for LoadNextMsg.
Signed-off-by: Derek Collison <derek@nats.io>
When a stream has multiple subjects and a consumer filters the stream to a small and spread out list of messages the logic would do a linear scan looking for the next message for the filtered consumer.
This CL allows the store layer to utilize the per subject info to improve the times.
Signed-off-by: Derek Collison <derek@nats.io>
Under the covers we were calculating pending per msg block incorrectly when a single message existed beyond the requested sequence.
Signed-off-by: Derek Collison <derek@nats.io>
Under certain situations large number of consumers that are racing to update state or delete their stores during a delete
would start taking up OS threads due to blocking disk IO. When this happened and their were a bunch of Go routines becoming
runnable the Go runtime would create extra OS threads to fill in the runnable pool and would exhaust the max thread setting.
This code places a channel as a simple semaphore to limit the number of disk IO blocking OS threads.
Signed-off-by: Derek Collison <derek@nats.io>
The filestore would release a msgBlock lock while trying to load a cache block if it thought it needed to flush pending data.
With async false, this should be very rare but was possible after careful inspection.
I constructed an artificial test with sleeps throughout the filestore code to reproduce.
It involved having 2 Go routines that were through and waiting on the last msg block, and another one that was writing.
After the write, but before we flushed after releasing the lock we would also artificially sleep.
This would lead to the second read seeing the cache load was already in progress and return no error.
If the load was for a sequence before the current write sequence, and async was false, the cache fseq would be higher than what was requested.
This would cause the errPartialCache to be returned.
Once returned to the consumer level in loopAndGather, it would exit that Go routine and the consumer would cease to function.
This change removed the unlock of a msgBlock to perform and flush, ensuring that two cacheLoads would not yield the errPartialCache.
I also updated the consumer in the case this does happen in the future to not exit the loopAndGather Go routine.
Signed-off-by: Derek Collison <derek@nats.io>
A low-level Filestore issue would cause a new block to be created
when the last block was empty, but the index for the new block
would not be forced to be written on disk.
The observed issue could be that with a stream with a WorkQueue
retention policy, its first/last sequence values could be reset
after a pull subscriber would have consumed all messages and
the server was restarted without a clean shutdown.
This would cause the pull subscriber to "stall" until enough
new messages are sent to reach a stream sequence that catches
up with the consumer's view of the stream first sequence prior
to the restart.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a consumer is configured with "meta-only" option, and the
stream was backed by a memory store, a memory corruption could
happen causing the application to receive corrupted headers.
Also replaced most of usage of `append(a[:0:0], a...)` to make
copies. This was based on this wiki:
https://github.com/go101/go101/wiki/How-to-efficiently-clone-a-slice%3F
But since Go 1.15, it is actually faster to call make+copy instead.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Upon server restart a server would set the check expiration to the configured amount vs delta of next to expire.
Signed-off-by: Derek Collison <derek@nats.io>
We were not escaping the top level iterator across message blocks when calculating when to break due to keep > 0.
Signed-off-by: Derek Collison <derek@nats.io>
Replicated durable consumers that were backed by a memory store were bypassing snapshotting which also did compaction of the raft WAL.
This change adapts for memory store backed consumers by compacting the raft WAL directly on snapshot logic.
Signed-off-by: Derek Collison <derek@nats.io>
When we triggered a filestore msg block compact we were not properly dealing with interior deletes.
Subsequent lookups past the skipped messages would cause an error and stop delivering messages.
Signed-off-by: Derek Collison <derek@nats.io>
Under load and pressure from concurrent publishing and consuming with multiple consumers the filestore would
return a partial or no cache error to the upper layers. For consumers this could result in us skipping a stream sequence when we should not.
This change stabilizes the filestore and removes the flush state for msg blocks. I also found some bugs that did not track last sequence properly
after snapshots / restore.
Signed-off-by: Derek Collison <derek@nats.io>
This change introduces utilization, better interior block deletes, and individual block compaction when we are below 50% utilization of the block.
Signed-off-by: Derek Collison <derek@nats.io>