This avoids memory copies when they are not necessary. Also fixed
a bug there that would run the subject match check twice when there
was only 1 configured subject and that subject did not match (say
the configured subject is foo.* and the key is foo.bar).
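For context, NATS subject matching is a per-token comparison with wildcards. A minimal sketch of that logic (illustrative only, not the server's actual helper):

```go
package main

import (
	"fmt"
	"strings"
)

// matchSubject reports whether subject matches a filter that may contain
// the '*' (single token) and '>' (remaining tokens) wildcards.
// Illustrative sketch, not the server's implementation.
func matchSubject(filter, subject string) bool {
	ft := strings.Split(filter, ".")
	st := strings.Split(subject, ".")
	for i, t := range ft {
		if t == ">" {
			return true // '>' matches everything that follows
		}
		if i >= len(st) {
			return false // subject ran out of tokens
		}
		if t != "*" && t != st[i] {
			return false // literal token mismatch
		}
	}
	return len(ft) == len(st)
}

func main() {
	fmt.Println(matchSubject("foo.*", "foo.bar"))     // true
	fmt.Println(matchSubject("foo.*", "foo.bar.baz")) // false
}
```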
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Under contention on the head write block, the system could perform
worse memory-wise than the plain Go runtime.
We were also keeping references to message subjects that were
bloating memory.
Signed-off-by: Derek Collison <derek@nats.io>
We had a report of a panic on server restart with 2.8.0-beta.1. The panic happened while allocating a load block whose size was based on the number of messages the index said the block had.
Before, SkipMsg would decrement the message count, and when we added the record via writeMsgRecord we would add it back in. However, we released the lock in between, meaning other operations could run.
If, after the decrement brought the count down (say to 0; we did protect against underflow there), a remove and a subsequent writeIndexInfo ran, the stamped count would underflow.
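The failure mode is classic unsigned underflow. A toy sketch of the guard (names are illustrative, not the filestore's code):

```go
package main

import "fmt"

// safeDec decrements an index message count with an underflow guard.
// Illustrative only; every path that can stamp the index must use
// the guarded form, not just some of them.
func safeDec(msgs uint64) uint64 {
	if msgs == 0 {
		return 0 // 0 - 1 on a uint64 would wrap to 1<<64 - 1
	}
	return msgs - 1
}

func main() {
	fmt.Println(safeDec(1)) // 0
	fmt.Println(safeDec(0)) // 0 instead of wrapping
	var n uint64
	// Unguarded, the wrap is what made the restart try to allocate a
	// load block sized for an absurd number of messages:
	fmt.Println(n - 1) // 18446744073709551615
}
```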
Signed-off-by: Derek Collison <derek@nats.io>
Also fixed a bug where we were incorrectly not spinning up the
monitoring loop for a stream when scaling replicas from 3->1->3.
Signed-off-by: Derek Collison <derek@nats.io>
startGoRoutine will execute the given closure as a go routine,
so passing copyBytes(msg) as the argument caused a race. The
copy needs to be done before calling startGoRoutine, as it was
before being changed in https://github.com/nats-io/nats-server/pull/2925
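The fix pattern is generic Go: copy the shared buffer before spawning the goroutine, then capture the copy. A minimal sketch (copyBytes re-implemented here just for illustration):

```go
package main

import "fmt"

// copyBytes returns an independent copy of the given slice
// (mirrors the idea of the server's helper; re-implemented here).
func copyBytes(b []byte) []byte {
	c := make([]byte, len(b))
	copy(c, b)
	return c
}

func main() {
	msg := []byte("hello")
	done := make(chan []byte, 1)

	// Racy form: copyBytes(msg) runs inside the new goroutine, so it
	// can read msg while the read loop is rewriting the buffer:
	//   go func() { done <- copyBytes(msg) }()
	//
	// Fixed form: make the copy first, then capture only the copy.
	cm := copyBytes(msg)
	go func() {
		done <- cm
	}()

	copy(msg, []byte("bye!!"))  // read loop reuses the buffer
	fmt.Println(string(<-done)) // "hello": the goroutine saw the copy
}
```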
Here is the race observed:
```
==================
WARNING: DATA RACE
Write at 0x00c0001dd930 by goroutine 367:
runtime.racewriterange()
<autogenerated>:1 +0x29
internal/poll.ignoringEINTRIO()
/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/internal/poll/fd_unix.go:582 +0x454
internal/poll.(*FD).Read()
/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/internal/poll/fd_unix.go:163 +0x26
net.(*netFD).Read()
/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/net/fd_posix.go:56 +0x50
net.(*conn).Read()
/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/net/net.go:183 +0xb0
net.(*TCPConn).Read()
<autogenerated>:1 +0x64
github.com/nats-io/nats-server/v2/server.(*client).readLoop()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1188 +0x8f7
github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func1()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:904 +0x5d
Previous read at 0x00c0001dd930 by goroutine 93:
runtime.slicecopy()
/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/slice.go:284 +0x0
github.com/nats-io/nats-server/v2/server.copyBytes()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/util.go:282 +0x10b
github.com/nats-io/nats-server/v2/server.(*Server).jsStreamListRequest.func1()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1613 +0x26
Goroutine 367 (running) created at:
github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3017 +0x86
github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:904 +0x1b08
github.com/nats-io/nats-server/v2/server.(*Server).startLeafNodeAcceptLoop.func1()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:604 +0x4b
github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections.func1()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:2122 +0x58
Goroutine 93 (running) created at:
github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3017 +0x86
github.com/nats-io/nats-server/v2/server.(*Server).jsStreamListRequest()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1613 +0xbf1
github.com/nats-io/nats-server/v2/server.(*Server).jsStreamListRequest-fm()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1554 +0xcc
github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:680 +0xcf0
github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch-fm()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:652 +0xcc
github.com/nats-io/nats-server/v2/server.(*client).deliverMsg()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3181 +0xbde
github.com/nats-io/nats-server/v2/server.(*client).processMsgResults()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:4164 +0xe1e
github.com/nats-io/nats-server/v2/server.(*client).processInboundLeafMsg()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:2183 +0x7eb
github.com/nats-io/nats-server/v2/server.(*client).processInboundMsg()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3498 +0xb1
github.com/nats-io/nats-server/v2/server.(*client).parse()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/parser.go:497 +0x3886
github.com/nats-io/nats-server/v2/server.(*client).readLoop()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1228 +0x1669
github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func1()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:904 +0x5d
==================
testing.go:1152: race detected during execution of test
```
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Would possibly show up when the consumer leader changed for a
consumer that had redelivered messages while, for instance, messages
were inbound on the stream.
Resolves #2912
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The data from other nodes is usually wrong, which can be quite
confusing for users, so we now only send it when we are the leader.
Signed-off-by: R.I.Pienaar <rip@devco.net>
Previously we relied more heavily on Go's garbage collector: when we loaded a block for an underlying stream we would pass references upward to avoid copies.
Now we always copy when passing back to the upper layers, which allows us to not only expire our cache blocks but also pool and reuse them.
The upper layers were also changed so that the pooling layer at that level can optionally interoperate with the storage layer.
Also fixed some flappers and a bug where de-dupe might not be performed correctly.
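Copying at the boundary is what makes pooling safe: once no reference to a block's buffer escapes upward, the buffer can be recycled. A sketch of the pattern with sync.Pool (block size and names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

const blkSize = 8 * 1024 // illustrative block size

// blkPool recycles block buffers instead of leaving them to the GC.
var blkPool = sync.Pool{
	New: func() any { return make([]byte, blkSize) },
}

// loadMsg copies the stored record out of the cached block before
// returning it, so the block buffer itself never escapes upward.
func loadMsg(blk []byte, off, ln int) []byte {
	out := make([]byte, ln)
	copy(out, blk[off:off+ln])
	return out
}

func main() {
	blk := blkPool.Get().([]byte)
	copy(blk, "some-record")

	msg := loadMsg(blk, 0, 11)

	// Safe to recycle now: msg is an independent copy.
	blkPool.Put(blk)

	fmt.Println(string(msg)) // "some-record"
}
```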
Signed-off-by: Derek Collison <derek@nats.io>
Since the "next" timer value was set to the AckWait value, which
is the first element of the BackOff list when present, the check
could fire at that interval even when we were past the first
redelivery and the backoff interval had increased.
The end user would still see the redelivery happen at the durations
indicated by the BackOff list, but internally we were checking
at the first BackOff interval's ack wait.
I added a test that uses the store's interface to detect how many
times the checkPending() function is invoked. For this test it
should have been invoked twice, but without the fix it was invoked
15 times.
Also fixed an unrelated test that could possibly deadlock, causing
tests to be aborted due to inactivity on Travis.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced by the change for ipQueues in #2931.
The (*ipQueue).unregister() was written with a protection for
the ipQueue being nil; however, mset.outq is actually not a bare
ipQueue but a jsOutQ that embeds a pointer to an ipQueue. So we
need to implement unregister() for jsOutQ.
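The subtlety is a Go one: a nil-receiver guard on the embedded type's method does not help, because calling the promoted method through a nil outer pointer dereferences the outer struct first. A minimal reproduction (types re-declared here purely for illustration):

```go
package main

import "fmt"

type ipQueue struct{ name string }

// unregister is safe on a nil *ipQueue receiver.
func (q *ipQueue) unregister() {
	if q == nil {
		return
	}
	fmt.Println("unregistered", q.name)
}

type jsOutQ struct {
	*ipQueue // embedded pointer: promoted methods load this field first
}

// The fix: give the wrapper its own nil-safe method, shadowing the
// promoted one.
func (q *jsOutQ) unregister() {
	if q == nil {
		return
	}
	q.ipQueue.unregister()
}

func main() {
	var outq *jsOutQ
	// Without (*jsOutQ).unregister, outq.unregister() would resolve to
	// the promoted (*ipQueue).unregister and panic loading outq.ipQueue
	// from the nil outer pointer, before the inner nil check could run.
	outq.unregister() // safe now
	fmt.Println("ok")
}
```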
Added a test that reproduced the issue, but found it with a flapping
test (TestJetStreamLongStreamNamesAndPubAck) that failed due to
a file name too long.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If accounts{} block is specified, authorization{} should not have
any user/password/token or users array defined.
The reason is that users parsed in accounts{} are associated with
their respective account, while users parsed in authorization{} are
associated with the global account. If the same user name appears
in both, and since internally the parsing of those 2 blocks happens
in a completely random order (even if they are laid out in a specific
order in the config), the outcome may be that a user ends up
associated with either its account or the default global account.
To minimize breaking changes, but still avoid this unexpected
outcome, the server will now detect duplicate users (or nkeys)
inside the authorization{} block itself, but also between that
block and accounts{}.
The check will also detect the case where accounts{} has any
user/nkey, in which case the authorization{} block should not have
any user/password/token, making this check similar to the one we
already had within the authorization{} block itself.
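For example, a config shaped like the following is now rejected, since user foo would otherwise be nondeterministically bound to either account A or the global account (illustrative config, not taken from a report):

```
accounts {
  A {
    users = [ { user: foo, password: bar } ]
  }
}
authorization {
  # Duplicate of a user already defined in accounts{}: now an error.
  users = [ { user: foo, password: bar } ]
}
```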
Resolves #2926
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
I got this panic in a test:
```
=== RUN TestJetStreamClusterAccountLoadFailure
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0xb1501b]
goroutine 47853 [running]:
github.com/nats-io/nats-server/v2/server.(*jetStream).processLeaderChange(0xc000b60580, 0x0)
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:3638 +0x9b
github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster(0xc000b60580)
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:853 +0x60f
created by github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3017 +0x87
FAIL github.com/nats-io/nats-server/v2/server 227.888s
```
which, from that branch, would point to this line in function
processLeaderChange():
```
} else if node := js.getMetaGroup().GroupLeader(); node == _EMPTY_ {
```
which I guess meant that getMetaGroup() was returning `nil`.
Refactored a bit to get the group leader in 2 steps.
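The two-step form makes the nil intermediate explicit. A sketch of the shape of the fix (types heavily simplified, names illustrative):

```go
package main

import "fmt"

// Simplified stand-ins for the real types.
type raftGroup struct{ leader string }

// GroupLeader is not nil-safe: on a nil *raftGroup it panics.
func (g *raftGroup) GroupLeader() string { return g.leader }

type jetStream struct{ meta *raftGroup }

// getMetaGroup may return nil, e.g. before the meta group is created.
func (js *jetStream) getMetaGroup() *raftGroup { return js.meta }

// leaderOrEmpty is the two-step version of the chained
// js.getMetaGroup().GroupLeader() call that panicked.
func (js *jetStream) leaderOrEmpty() string {
	mg := js.getMetaGroup()
	if mg == nil {
		return "" // _EMPTY_ in the server's code
	}
	return mg.GroupLeader()
}

func main() {
	js := &jetStream{} // meta group not created yet
	fmt.Printf("leader=%q\n", js.leaderOrEmpty()) // leader=""
}
```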
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removal of a stream source that was external was not working properly,
allowing messages to keep flowing after the removal until the
server hosting the stream from which the source was removed was
restarted.
Resolves #2920
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>