Replicated durable consumers that were backed by a memory store were bypassing snapshotting which also did compaction of the raft WAL.
This change adapts for memory store backed consumers by compacting the raft WAL directly on snapshot logic.
Signed-off-by: Derek Collison <derek@nats.io>
When we stored a message in the raft layer in a wrong position (state corrupt), we would panic, leaving the message there.
On restart we would truncate the WAL and try to repair, but we truncated to the wrong index of the bad entry.
This change also includes additional changes to truncateWAL and also reduces the conditional for panic on storeMsg.
Signed-off-by: Derek Collison <derek@nats.io>
Under load and pressure from concurrent publishing and consuming with multiple consumers the filestore would
return a partial or no cache error to the upper layers. For consumers this could result in us skipping a stream sequence when we should not.
This change stabilizes the filestore and removes the flush state for msg blocks. I also found some bugs that did not track last sequence properly
after snapshots / restore.
Signed-off-by: Derek Collison <derek@nats.io>
This change introduces utilization, better interior block deletes, and individual block compaction when we are below 50% utilization of the block.
Signed-off-by: Derek Collison <derek@nats.io>
Added in client kind and sub type for clients.
Added in ability to filter connections based on matching subject interest.
Signed-off-by: Derek Collison <derek@nats.io>
During normal operation and quick restarts the number of expired messages per cycle is manageable and correct.
However if a server is shutdown for quite a long time and many messages have expired this process is too slow.
This commit introduces an optimized expiration tailored for startup vs running state.
Signed-off-by: Derek Collison <derek@nats.io>
When processing service imports we would swap out the accounts during processing.
With the addition of internal subscriptions and internal clients publishing in JetStream we had an issue with the wrong account being used.
This was specific to delyaed pull subscribers trying to unsubscribe due to max of 1 while other JetStream API calls were running concurrently.
We optimized the filtered purge to skip msgBlks that are not in play.
Also optimized msgBlock buffer usage by using two sync.Pools to enhance reuse.
Signed-off-by: Derek Collison <derek@nats.io>
Allow wider scoped filtered subjects.
We introduce a per subject information tracking to filestore to optimize for large mux'd streams and more efficient filtered consumers.
Signed-off-by: Derek Collison <derek@nats.io>
Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".
This change checks for more filesystem related errors and calls t.Fatal
if there is an error.
When a new leader is elected it has to give everyone a chance to reply,
so that we can observe rejections with higher term.
The maximum election timeout is 7.5 seconds.
The new behavior of waiting for the election timeout caused unit tests
to fail. Hence upping the timeout there as well.
Signed-off-by: Matthias Hanel <mh@synadia.com>
- Fixed the close of a TLS connection which starting Go 1.16
set the deadline to 5 seconds.
- Fixed an issue with setHeader that was causing these error messages
```
=== RUN TestServiceImportReplyMatchCycleMultiHops
nats: message could not decode headers on connection [4] for subscription on "foo"
--- PASS: TestServiceImportReplyMatchCycleMultiHops (0.04s)
```
- Fixed names of tests in norace_test.go since they must start with
TestNoRace in order to make sure that we execute them in Travis:
```
go test -v -run=TestNoRace --failfast -p=1 ./...
```
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Add in proper support for stream updates in clustered mode.
Don't send API updates without subjects, caused GW parser errors.
Stream internal loops use their own clients now.
Signed-off-by: Derek Collison <derek@nats.io>