Would possibly show up when a consumer leader changes for a consumer
that had redelivered messages and for instance messages were inbound
on the stream.
Resolves#2912
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoimd copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.
The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.
Also fixed some flappers and a bug where de-dupe might not be reformed correctly.
Signed-off-by: Derek Collison <derek@nats.io>
Since the "next" timer value is set to the AckWait value, which
is the first element in the BackOff list if present, the check
would possibly happen at this interval, even when we were past
the first redelivery and the backoff interval had increased.
The end-user would still see the redelivery be done at the durations
indicated by the BackOff list, but internally, we would be checking
at the initial BackOff's ack wait.
I added a test that uses the store's interface to detect how many
times the checkPending() function is invoked. For this test it
should have been invoked twice, but without the fix it was invoked
15 times.
Also fixed an unrelated test that could possibly deadlock causing
tests to be aborted due to inactivity on Travis.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced by the change for ipQueues in #2931.
The (*ipQueue).unregister() was written with a protection for
the ipQueue to be nil, however, mset.outq is actually not a bare
ipQueue but a jsOutQ that embeds a pointer to an ipQueue. So we
need to implement register() for jsOutQ.
Added a test that reproduced the issue, but found it with a flapping
test (TestJetStreamLongStreamNamesAndPubAck) that failed due to
a file name too long.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If accounts{} block is specified, authorization{} should not have
any user/password/token or users array defined.
The reason is that users parsed in accounts{} are associated with
their respective account but users parsed in authorization{} are
associated with the global account. If the same user name is
in both, and since internally the parsing of those 2 blocks is
completely random (even if layed out in the config in a specific
order), the outcome may be that a user is either associated with
an account or the default global account.
To minimize breaking changes, but still avoid this unexpected
outcome, the server will now detect if there are duplicate users
(or nkeys) inside authorization{} block itself, but also between
this block and accounts{}.
The check will also detect if accounts{} has any user/nkey, then
the authorization{} block should not have any user/password/token,
making this test similar to the check we had in authorization{}
block itself.
Resolves#2926
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
I got this panic in a test:
```
=== RUN TestJetStreamClusterAccountLoadFailure
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0xb1501b]
goroutine 47853 [running]:
github.com/nats-io/nats-server/v2/server.(*jetStream).processLeaderChange(0xc000b60580, 0x0)
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:3638 +0x9b
github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster(0xc000b60580)
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:853 +0x60f
created by github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3017 +0x87
FAIL github.com/nats-io/nats-server/v2/server 227.888s
```
which from that branch would point to function processLeaderChange()
line:
```
} else if node := js.getMetaGroup().GroupLeader(); node == _EMPTY_ {
```
which I guess meant that getMetaGroup() was returning `nil`.
Refactored a bit to get the group leader in 2 steps.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removal of a stream source that was external was not working properly,
allowing messages to still flow after the removal and until the
server hosting the stream to which the source was removed was
restarted.
Resolves#2920
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
For reason explained in previous commit, for tests that were
expecting the number of ack/pending to be of a certain value after
an Ack(), they would be flapping. Replaced all references and
we can go back to selectively call Ack() when AckSync() is not
needed.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Since acks are now processed in different go-routine, the tests
that use Ack() cannot expect the number of ack messages to be
exact immediately. So in this test use AckSync() to ensure that
the ack is processed. Alternatively, the pending count should
be checked with a checkFor().
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have beend pop()'ed. When recycle() is invoked this count
goes down.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This adds a test to see if we can update a stream when the stream limit
is 1. Currently, this test fails, so we're skipping it. This test will
be enabled in a future PR.
We also would hang if no stream info requests were sent during a stream list due to the asset being offline.
Signed-off-by: Derek Collison <derek@nats.io>
Gateway connection will be closed and error reported if a remote
has a name that is a duplicate of the local cluster.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.
Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>