The `JSStreamNameExistErr` will now include in the description that
the stream exists with a different configuration, because that is
the error clients would get when trying to add a stream with a
different configuration (otherwise this is a no-op and client
don't get an error).
Since that error was used in case of restore, a new error is added
but uses the same description prefix "stream name already in use"
but adds ", cannot restore" to indicate that this is a restore
failure because the stream already exists.
Resolves#3273
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The Move/Cancel/Downscale mechanism did not take into account that
the consumer's replica count can be set independently.
This also alters peer selection to have the ability to skip
unique tag prefix check for server that will be replaced.
Say you have 3 az, and want to add another server to az:1,
in order to replace a server that is the same zone.
Without this change, uniqueTagPrefix check would filter
the server to replace with and cause a failure.
The cancel move response could not be received due to
the wrong account name.
Signed-off-by: Matthias Hanel <mh@synadia.com>
With the previous commit, this should no longer be possible, but
otherwise we got report of a panic when accessing the mirror configuration.
The following test would be able to produce the panic:
```
s := RunBasicJetStreamServer()
if config := s.JetStreamConfig(); config != nil {
defer removeDir(t, config.StoreDir)
}
defer s.Shutdown()
nc, js := jsClientConnect(t, s)
defer nc.Close()
_, err := js.AddStream(&nats.StreamConfig{Name: "SOURCE"})
require_NoError(t, err)
cfg := &nats.StreamConfig{
Name: "M",
Mirror: &nats.StreamSource{Name: "SOURCE"},
}
_, err = js.AddStream(cfg)
require_NoError(t, err)
err = js.DeleteStream("SOURCE")
require_NoError(t, err)
cfg.Mirror = nil
_, err = js.UpdateStream(cfg)
require_NoError(t, err)
time.Sleep(5 * time.Second)
```
Again, now that we reject the mirror config update, the panic
should no longer happen, but adding preventive code in case we
allow in the future.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Updates to stream mirror config are rejected in cluster mode, but
were not in standalone. This PR adds the check in standalone mode.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
* add the ability to cancel a move in progress
Move to individual subjects for move and cancel_move
New subjects are:
$JS.API.ACCOUNT.STREAM.MOVE.*.*
$JS.API.ACCOUNT.STREAM.CANCEL_MOVE.*.*
last and second to last token are account and stream name
Signed-off-by: Matthias Hanel <mh@synadia.com>
Added http monitoring endpoint /accstatz
It responds with a list of statz for all accounts with local connections
the argument "unused=1" can be provided to get statz for all accounts
This endpoint is also exposed as nats request under:
This monitoring endpoint is exposed via the system account.
$SYS.REQ.ACCOUNT.*.STATZ
Each server will respond with connection statistics for the requested
account. The format of the data section is a list (size 1) identical to the event
$SYS.ACCOUNT.%s.SERVER.CONNS which is sent periodically as well as on
connect/disconnect. Unless requested by options, server without the account,
or server where the account has no local connections, will not respond.
A PING endpoint exists as well. The response format is identical to
$SYS.REQ.ACCOUNT.*.STATZ
(however the data section will contain more than one account, if they exist)
In addition to general filter options the request takes a list of accounts and
an argument to include accounts without local connections (disabled by default)
$SYS.REQ.ACCOUNT.PING.STATZ
Each account has a new system account import where the local subject
$SYS.REQ.ACCOUNT.PING.STATZ essentially responds as if
the importing account name was used for $SYS.REQ.ACCOUNT.*.STATZ
The only difference between requesting ACCOUNT.PING.STATZ from within
the system account and an account is that the later can only retrieve
statz for the account the client requests from.
Also exposed the monitoring /healthz via the system account under
$SYS.REQ.SERVER.*.HEALTHZ
$SYS.REQ.SERVER.PING.HEALTHZ
No dedicated options are available for these.
HEALTHZ also accept general filter options.
Signed-off-by: Matthias Hanel <mh@synadia.com>
In some situations, a server may report that a remote server is
detected as orphaned (and the node is marked as offline). This is
because the orphaned detection relies on conns update to be received,
however, servers would suppress the update if an account does not
have any connections attached.
This PR ensures that the update is sent regardless if the account
is JS configured (not necessarily enabled at the moment).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The test was lowering the eventsHBInterval to low value but not
restoring its original value. This would cause some servers to
be considered "orphan" and the shutdown event sent causing their
raft node to be considered "offline", which then would result
in some tests failing with "insufficient resources".
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
A customer experienced and endless failure to have a stream cacthup. The current leader was being asked for a message from a snapshot that was larger then what we had, resulting in EOF which silently failed.
We now detect this and signal end of catchup and redo the bad snapshot if possible.
Signed-off-by: Derek Collison <derek@nats.io>
When increasing the replica count unique tags for already existing peers
where ignored, which could lead to bad placement
Signed-off-by: Matthias Hanel <mh@synadia.com>
We use to ignore if the seq was 0, but now would treat it as
a requirement that the stream be empty if header is present but
set to 0.
This relates to client PR: https://github.com/nats-io/nats.go/pull/958
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Make sure when processing a peer removal that the stream assignment agrees.
When a new leader takes over it can resend a peer removal, and if the stream/consumer really was rescheduled we could remove by accident.
Also need to make sure that when we remove a stream we remove the node as part of the stream assignment.
If we didn't, if the same asset returned to this server we would not start up the monitoring loop.
Simplify migration logic in monitorStream, to be driven by leader only
Improved unit tests
Added failure when server not in peer list
Move command does not require server anymore
Signed-off-by: Matthias Hanel <mh@synadia.com>
This allows a solciting leafnode config to ask that any JetStream cluster assets that are a current leader have the leader stepdown.
Signed-off-by: Derek Collison <derek@nats.io>
This effectively means that requests with batch > 1 will process a message and go to the end of the line.
Signed-off-by: Derek Collison <derek@nats.io>