During an elected stepdown and leadership transfer, allow the new leader to take over before we step down.
We could receive a leader change, so make sure to also check migration state.
Signed-off-by: Derek Collison <derek@nats.io>
* [Adding] max_ha_assets to limit placement on servers with too many HA assets
Servers running more than max_ha_assets raft nodes will not be used to
place new streams, and placement fails if not enough free servers can be found.
Durable consumer creation on such servers will fail, as their peer set is
bound to the same set as their stream.
This also avoids updating placement where no new placement is needed.
This is the case when, on update, placement tags get removed.
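A rough sketch of the placement filter described above. The `peer` type, `selectPeers`, and the "0 means no limit" rule are assumptions made for illustration, not the server's actual placement code:

```go
package main

import (
	"errors"
	"fmt"
)

// peer is a hypothetical stand-in for a server considered for placement.
type peer struct {
	name     string
	haAssets int // raft nodes (HA assets) currently running on this server
}

// selectPeers picks the first `replicas` peers that do not run more than
// maxHA raft nodes. A maxHA of 0 is treated as "no limit" (an assumption).
func selectPeers(peers []peer, replicas, maxHA int) ([]string, error) {
	var picked []string
	for _, p := range peers {
		if maxHA > 0 && p.haAssets > maxHA {
			continue // over the limit, not eligible for new streams
		}
		picked = append(picked, p.name)
		if len(picked) == replicas {
			return picked, nil
		}
	}
	return nil, errors.New("not enough servers under max_ha_assets")
}

func main() {
	peers := []peer{{"S1", 5}, {"S2", 1}, {"S3", 2}}
	fmt.Println(selectPeers(peers, 2, 2))
	fmt.Println(selectPeers(peers, 3, 2))
}
```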
Signed-off-by: Matthias Hanel <mh@synadia.com>
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The system will allow an update to a stream, and subsequently all attached consumers, to be placed in another cluster either directly or via tag placement.
The meta layer will scale the underlying peerset appropriately to straddle the two clusters for both the stream and consumers, taking into account the consumer type.
Control will then pass to the current leaders of the assets who will monitor the catchup status of the new peers.
(Note we can optimize this later to only traverse once across a GW for any given asset, but for now this is simpler)
Once the original leaders have determined the assets are synced, they will pass leadership to a member of the new peerset.
Once the new leader has been elected, it will forward a request for the meta layer to shrink the peerset by removing the old peers.
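The handoff condition can be sketched roughly as follows. `handoffTarget`, the applied-index map, and the peer names are hypothetical and only illustrate "wait until every new peer has caught up, then pick a member of the new peerset":

```go
package main

import "fmt"

// handoffTarget is a hypothetical helper. It returns a member of the new
// peer set that leadership can be transferred to, but only once every new
// peer has applied at least as much as the current leader.
func handoffTarget(newPeers []string, applied map[string]uint64, leaderIndex uint64) (string, bool) {
	if len(newPeers) == 0 {
		return "", false
	}
	for _, p := range newPeers {
		if applied[p] < leaderIndex {
			return "", false // still catching up, keep monitoring
		}
	}
	return newPeers[0], true
}

func main() {
	newPeers := []string{"C2-S1", "C2-S2"}
	applied := map[string]uint64{"C2-S1": 100, "C2-S2": 90}

	_, ok := handoffTarget(newPeers, applied, 100)
	fmt.Println("ready:", ok)

	applied["C2-S2"] = 100
	target, ok := handoffTarget(newPeers, applied, 100)
	fmt.Println("ready:", ok, "target:", target)
}
```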
Signed-off-by: Derek Collison <derek@nats.io>
Some warnings, especially when dealing with JS limits that were
printed on a per-message basis, are now limited to ~1 per second
if the content of the warning is already found in a map.
This also applies to "client" warnings, but the client portion of the
warning is not taken into account, which helps reduce logging
for similar content coming from different clients.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
adds unit test to test this scenario
improves reporting of correct error
only show info for non existing tiers where streams exist
Signed-off-by: Matthias Hanel <mh@synadia.com>
* Added max_ack_pending setting to JS account limits
Because of the addition, defaults now have to be set later (they depend on
these new limits now).
Also re-organized the code to more closely track how stream create looks.
Signed-off-by: Matthias Hanel <mh@synadia.com>
Also fixed a bug that could cause memory based replicated consumers to no longer work after snapshots and server restarts.
The snapshot logic would allow non-state-changing updates to continuously grow the raft logs. We were also too conservative about when we snapshotted and why.
Also added the ability to have FileStore.Compact() reclaim space from the block file from the head of the last changed block.
Signed-off-by: Derek Collison <derek@nats.io>
* Adding server limits (max ack pending/dedupe window) to js config
Also shifting consumer config check to jsConsumerCreate as in clustered
mode this was enforced in the wrong place
Signed-off-by: Matthias Hanel <mh@synadia.com>
Also fixed a bug where we were incorrectly not spinning up the monitoring loop for a stream when going from 3->1->3.
Signed-off-by: Derek Collison <derek@nats.io>
Previously we would rely more heavily on Go's garbage collector, since when we loaded a block for an underlying stream we would pass references upward to avoid copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.
The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.
Also fixed some flappers and a bug where de-dupe might not be reformed correctly.
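The copy-then-pool idea can be sketched like this. `blockPool`, `loadBlock`, `fetch`, and `release` are illustrative names, not the filestore's real functions:

```go
package main

import (
	"fmt"
	"sync"
)

// blockPool holds reusable cache buffers (hypothetical name and size).
var blockPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 4096)
		return &b
	},
}

// loadBlock simulates loading a block into a pooled buffer.
func loadBlock(data string) *[]byte {
	bp := blockPool.Get().(*[]byte)
	*bp = append((*bp)[:0], data...)
	return bp
}

// fetch returns a copy of the requested range, so upper layers never
// hold a reference into the cache buffer.
func fetch(bp *[]byte, start, end int) []byte {
	out := make([]byte, end-start)
	copy(out, (*bp)[start:end])
	return out
}

// release returns the buffer to the pool. This is only safe because
// fetch handed out copies rather than sub-slices.
func release(bp *[]byte) { blockPool.Put(bp) }

func main() {
	bp := loadBlock("hello world")
	msg := fetch(bp, 0, 5)
	release(bp)

	bp2 := loadBlock("XXXXXXXXXXX") // may reuse the same backing array
	fmt.Println(string(msg))        // still the copied bytes
	release(bp2)
}
```

The trade-off is one extra copy per message in exchange for being able to expire and reuse cache buffers without waiting on the garbage collector.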
Signed-off-by: Derek Collison <derek@nats.io>
I got this panic in a test:
```
=== RUN TestJetStreamClusterAccountLoadFailure
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0xb1501b]
goroutine 47853 [running]:
github.com/nats-io/nats-server/v2/server.(*jetStream).processLeaderChange(0xc000b60580, 0x0)
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:3638 +0x9b
github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster(0xc000b60580)
/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:853 +0x60f
created by github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine
/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3017 +0x87
FAIL github.com/nats-io/nats-server/v2/server 227.888s
```
which from that branch would point to function processLeaderChange()
line:
```
} else if node := js.getMetaGroup().GroupLeader(); node == _EMPTY_ {
```
which I guess meant that getMetaGroup() was returning `nil`.
Refactored a bit to get the group leader in 2 steps.
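A simplified illustration of the two-step lookup, using stand-in types (`raftNode`, `node`, `jetStream` here are hypothetical and much smaller than the real ones):

```go
package main

import "fmt"

// raftNode is a hypothetical, reduced version of the meta group interface.
type raftNode interface {
	GroupLeader() string
}

type node struct{ leader string }

func (n *node) GroupLeader() string { return n.leader }

type jetStream struct{ meta raftNode }

func (js *jetStream) getMetaGroup() raftNode { return js.meta }

// groupLeader fetches the meta group first and only asks it for the
// leader when it is non-nil, instead of chaining the two calls.
func (js *jetStream) groupLeader() (string, bool) {
	meta := js.getMetaGroup()
	if meta == nil {
		return "", false
	}
	return meta.GroupLeader(), true
}

func main() {
	js := &jetStream{} // meta group not set, as suspected in the panic
	fmt.Println(js.groupLeader())

	js.meta = &node{leader: "S1"}
	fmt.Println(js.groupLeader())
}
```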
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have been pop()'ed. When recycle() is invoked this count
goes down.
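The "in progress" accounting could look roughly like the following. This `queue` type is a hypothetical sketch; only the pop()/recycle() naming comes from the description above:

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical, simplified queue with "in progress" tracking.
type queue struct {
	mu         sync.Mutex
	pending    []string
	inProgress int
}

func (q *queue) push(v string) {
	q.mu.Lock()
	q.pending = append(q.pending, v)
	q.mu.Unlock()
}

// pop hands all pending elements to the caller and counts them as
// in progress until recycle is called.
func (q *queue) pop() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	batch := q.pending
	q.pending = nil
	q.inProgress += len(batch)
	return batch
}

// recycle signals that the caller is done with a popped batch.
func (q *queue) recycle(batch []string) {
	q.mu.Lock()
	q.inProgress -= len(batch)
	q.mu.Unlock()
}

// progress reports how many popped elements have not been recycled yet.
func (q *queue) progress() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.inProgress
}

func main() {
	q := &queue{}
	q.push("a")
	q.push("b")
	batch := q.pop()
	fmt.Println("in progress:", q.progress())
	q.recycle(batch)
	fmt.Println("in progress:", q.progress())
}
```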
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We also would hang if no stream info requests were sent during a stream list due to the asset being offline.
Signed-off-by: Derek Collison <derek@nats.io>
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.
Also fixed lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed directories in some
config files to use single quotes `'` around the file path,
again to work on Windows.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
A filtered consumer that has no state, meaning no messages are being processed, will still receive updates to properly track the delivered sequence as it relates to the entire stream.
Since we did not have state we were inadvertently skipping the compaction logic for the raft store.
Signed-off-by: Derek Collison <derek@nats.io>
This change allows slightly better logging on startup, to more easily map a RAFT log directory etc. to the stream/consumer.
Signed-off-by: Derek Collison <derek@nats.io>
In clustering mode, the number of consumers in stream info may be
wrong in the presence of non-durable consumers. Ephemerals are handled
by specific nodes. The StreamInfo response would contain only the
count of consumers that the stream leader is handling.
This fix overrides the stream's state consumers count with the
number of consumers from the stream assignment record.
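In sketch form, with hypothetical `streamState` and `streamAssignment` types standing in for the real records:

```go
package main

import "fmt"

// streamState and streamAssignment are hypothetical, reduced types.
type streamState struct {
	Msgs      uint64
	Consumers int // consumers known to the local stream leader only
}

type streamAssignment struct {
	consumers map[string]struct{} // all consumers assigned to the stream
}

// clusteredState overrides the locally known consumer count with the
// count from the assignment record, when one is available.
func clusteredState(local streamState, sa *streamAssignment) streamState {
	if sa != nil {
		local.Consumers = len(sa.consumers)
	}
	return local
}

func main() {
	local := streamState{Msgs: 10, Consumers: 1}
	sa := &streamAssignment{consumers: map[string]struct{}{
		"durable": {}, "ephemeral-1": {}, "ephemeral-2": {},
	}}
	fmt.Println(clusteredState(local, sa).Consumers)
}
```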
Resolves #2895
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We will only send if all peers in our group are >= 2.7.1 and we will check for updates.
When a consumer follower takes over it will notify all pending requests that those requests are invalid now.
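The version gate might be expressed along these lines. `versionAtLeast` and `allPeersAtLeast` are hypothetical helpers, and the sketch assumes plain "major.minor.patch" version strings:

```go
package main

import "fmt"

// versionAtLeast reports whether v ("major.minor.patch") is >= the given
// version. Strings that cannot be parsed are treated as too old.
func versionAtLeast(v string, major, minor, patch int) bool {
	var ma, mi, pa int
	if _, err := fmt.Sscanf(v, "%d.%d.%d", &ma, &mi, &pa); err != nil {
		return false
	}
	if ma != major {
		return ma > major
	}
	if mi != minor {
		return mi > minor
	}
	return pa >= patch
}

// allPeersAtLeast is true only if every peer version passes the check.
func allPeersAtLeast(versions []string, major, minor, patch int) bool {
	for _, v := range versions {
		if !versionAtLeast(v, major, minor, patch) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allPeersAtLeast([]string{"2.7.1", "2.8.0"}, 2, 7, 1))
	fmt.Println(allPeersAtLeast([]string{"2.7.0", "2.7.1"}, 2, 7, 1))
}
```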
Signed-off-by: Derek Collison <derek@nats.io>
This should help with GC pressure, however, it may have an effect
on performance (based on some benchmark). Calling sync.Pool.Get/Put
too often has a performance impact...
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>