The first issue was that applications would not get any response.
However, there was also a more serious issue: each concurrent request could create its own raft group.
The servers would only run one stream monitor loop, but they would update the state to the new raft group's name, so after a server restart the stream would be using a different raft group than the existing servers.
Signed-off-by: Derek Collison <derek@nats.io>
It is possible that a stream info request would be handled at a
time when the raft group was not yet set/created, causing
a panic.
Resolves #3626 (at least the panic reports there)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something
like this:
```
...
  "meta_cluster": {
    "name": "local",
    "leader": "A",
    "peer": "NUmM6cRx",
    "replicas": [
      {
        "name": "B",
        "current": true,
        "active": 690369000,
        "peer": "b2oh2L6w"
      },
      {
        "name": "Server name unknown at this time (peerID: jZ6RvVRH)",
        "current": false,
        "offline": true,
        "active": 0,
        "peer": "jZ6RvVRH"
      }
    ],
    "cluster_size": 3
  }
```
Note the new "peer" field following the "leader" field (which contains
the server name). The new field is the node ID, which is a hash of
the server name.
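As a rough illustration of how an 8-character node ID can be derived from a server name: hash the name and map the leading hash bytes onto an alphanumeric alphabet. The alphabet and mapping below are assumptions for illustration, not necessarily the server's exact algorithm.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// idAlphabet is an assumed 62-character alphabet; the real server
// implementation may use a different one.
const idAlphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// peerIDFromName sketches deriving a deterministic 8-character node ID
// from a server name by hashing it and mapping each of the first 8
// bytes onto the alphabet.
func peerIDFromName(name string) string {
	sum := sha256.Sum256([]byte(name))
	id := make([]byte, 8)
	for i := 0; i < 8; i++ {
		id[i] = idAlphabet[int(sum[i])%len(idAlphabet)]
	}
	return string(id)
}

func main() {
	// Same name always yields the same 8-character ID.
	fmt.Println(peerIDFromName("A"))
	fmt.Println(peerIDFromName("B"))
}
```

Because the ID depends only on the name, every server computes the same ID for a given peer without coordination.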
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The CLI will now be able to display the peer IDs in MetaGroupInfo
if it chooses to do so, and possibly help the user select a peer ID
from a list, with a new command to remove by peer ID instead of
by server name.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This can be helpful after a partial cluster restart, since in that
case the server name may not be known. However, "server report jetstream"
will report the peer ID, which can then be used.
For instance, here is the output after a cluster restart where server "C"
is not restarted.
```
nats -s nats://sys:pwd@localhost:4222 server report jetstream
...
╭────────────────────────────────────────────────────────────────────────────────────────────────╮
│ RAFT Meta Group Information                                                                    │
├─────────────────────────────────────────────────────┬────────┬─────────┬────────┬────────┬─────┤
│ Name                                                │ Leader │ Current │ Online │ Active │ Lag │
├─────────────────────────────────────────────────────┼────────┼─────────┼────────┼────────┼─────┤
│ A                                                   │ yes    │ true    │ true   │ 0.00s  │ 0   │
│ B                                                   │        │ true    │ true   │ 0.53s  │ 0   │
│ Server name unknown at this time (peerID: jZ6RvVRH) │        │ false   │ false  │ 0.00s  │ 0   │
╰─────────────────────────────────────────────────────┴────────┴─────────┴────────┴────────┴─────╯
```
With a change to the NATS CLI we could have something like:
```
nats -s nats://sys:pwd@localhost:4222 server raft peer-remove jZ6RvVRH --by_id
```
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
* changed format of JSClusterNoPeers error
This error was introduced in #3342 and reveals too much information.
This change gets rid of cluster names and peer counts.
All other counts were changed to booleans,
which are only included in the output when the filter was hit.
In addition, the set of non-matching tags is included.
Furthermore, the static error description in server/errors.json
is moved into selectPeerError.
Sample errors:
1) no suitable peers for placement, tags not matched ['cloud:GCP', 'country:US']
2) no suitable peers for placement, insufficient storage
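A minimal Go sketch of how such messages could be assembled; the helper name `selectPeerErrMsg` and its signature are assumptions for illustration, not the server's actual `selectPeerError`.

```go
package main

import (
	"fmt"
	"strings"
)

// selectPeerErrMsg sketches the reworked error text: no cluster name or
// peer counts, each failed filter appears only when it was hit, and the
// set of unmatched tags is listed.
func selectPeerErrMsg(insufficientStorage bool, unmatchedTags []string) string {
	msg := "no suitable peers for placement"
	if len(unmatchedTags) > 0 {
		quoted := make([]string, len(unmatchedTags))
		for i, t := range unmatchedTags {
			quoted[i] = "'" + t + "'"
		}
		msg += ", tags not matched [" + strings.Join(quoted, ", ") + "]"
	}
	if insufficientStorage {
		msg += ", insufficient storage"
	}
	return msg
}

func main() {
	// Reproduces the two sample errors above.
	fmt.Println(selectPeerErrMsg(false, []string{"cloud:GCP", "country:US"}))
	fmt.Println(selectPeerErrMsg(true, nil))
}
```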
Signed-off-by: Matthias Hanel <mh@synadia.com>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Co-authored-by: Ivan Kozlovic <ivan@synadia.com>
This can happen if the move was initiated by the user.
A subsequent cancel resets the initial peer list.
The original peer list was picked based on the old set of tags.
A cancel would then keep the new list of tags but reset
to the old peers; thus tags and peers diverge.
The problem is that at the time of cancel, the old
placement tags can no longer be found.
This fix causes cancel to remove the placement tags if
the old peers do not satisfy the new placement tags.
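A minimal sketch of the rule the fix applies on cancel; the `peer` type and `cancelMove` helper are illustrative assumptions, not the server's actual structures.

```go
package main

import "fmt"

// peer is an illustrative stand-in for a raft peer with its server tags.
type peer struct {
	name string
	tags []string
}

// peersSatisfy reports whether every peer carries every placement tag.
func peersSatisfy(peers []peer, placementTags []string) bool {
	for _, p := range peers {
		for _, want := range placementTags {
			found := false
			for _, t := range p.tags {
				if t == want {
					found = true
					break
				}
			}
			if !found {
				return false
			}
		}
	}
	return true
}

// cancelMove restores the original peers; if they no longer satisfy the
// current placement tags, the tags are removed so peers and tags cannot
// diverge.
func cancelMove(origPeers []peer, placementTags []string) ([]peer, []string) {
	if !peersSatisfy(origPeers, placementTags) {
		placementTags = nil
	}
	return origPeers, placementTags
}

func main() {
	_, tags := cancelMove([]peer{{name: "s1", tags: []string{"az:1"}}}, []string{"az:2"})
	fmt.Println(tags == nil) // tags dropped: the old peers cannot satisfy az:2
}
```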
Signed-off-by: Matthias Hanel <mh@synadia.com>
* better error when peer selection fails
It is pretty hard to diagnose what went wrong when not enough peers
were found for an operation. This change now returns counts of the
reasons why peers were discarded.
Changed the error to JSClusterNoPeers as it seems the more appropriate
error for that operation. Not having enough resources is one of
the conditions for a peer not being considered, but so is having a
non-matching tag, which is why JSClusterNoPeers seems more appropriate.
In addition, JSClusterNoPeers was already used as the error after one
call to selectPeerGroup.
example:
no suitable peers for placement: peer selection cluster 'C' with 3 peers
offline: 0
excludeTag: 1
noTagMatch: 2
noSpace: 0
uniqueTag: 0
misc: 0
Example for mqtt:
mid:12 - "mqtt" - unable to connect: create sessions stream for account "$G":
no suitable peers for placement: peer selection cluster 'MQTT' with 3 peers
offline: 0
excludeTag: 0
noTagMatch: 0
noSpace: 0
uniqueTag: 0
misc: 0
(10005)
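A sketch of how such a tally could be kept and rendered; the `placementError` type and its field names are illustrative, not the server's actual code.

```go
package main

import "fmt"

// placementError tallies, per discard reason, how many peers were
// rejected during peer selection for a cluster.
type placementError struct {
	cluster string
	peers   int

	offline, excludeTag, noTagMatch, noSpace, uniqueTag, misc int
}

// Error renders the counts in the multi-line format shown above.
func (e *placementError) Error() string {
	return fmt.Sprintf(
		"no suitable peers for placement: peer selection cluster '%s' with %d peers\n"+
			"offline: %d\nexcludeTag: %d\nnoTagMatch: %d\nnoSpace: %d\nuniqueTag: %d\nmisc: %d",
		e.cluster, e.peers, e.offline, e.excludeTag, e.noTagMatch, e.noSpace, e.uniqueTag, e.misc)
}

func main() {
	// Reproduces the first example: 1 peer excluded by tag, 2 without a
	// matching tag, out of 3 peers in cluster 'C'.
	err := &placementError{cluster: "C", peers: 3, excludeTag: 1, noTagMatch: 2}
	fmt.Println(err.Error())
}
```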
Signed-off-by: Matthias Hanel <mh@synadia.com>
* review comment
Signed-off-by: Matthias Hanel <mh@synadia.com>
* Adding account purge operation
The new request is available for the system account.
The subject to send the request to is $JS.API.ACCOUNT.PURGE.*,
with the name of the account to purge in place of the wildcard.
Also added directory cleanup code so that servers do not
end up with empty stream directories, or account directories that
only contain streams.
Also adds ACCOUNT to the leaf node domain rewrite table.
Addresses #3186 and #3306 by providing a way to
get rid of the streams of existing and non-existing accounts.
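On the receiving side, the account name has to be recovered from the last subject token. A sketch of that extraction; the helper name `accountFromPurgeSubject` is an assumption, not the server's API.

```go
package main

import (
	"fmt"
	"strings"
)

// accountFromPurgeSubject extracts the account name from a purge request
// subject such as $JS.API.ACCOUNT.PURGE.MYACC. It rejects subjects that
// do not carry exactly one token after the prefix.
func accountFromPurgeSubject(subject string) (string, bool) {
	const prefix = "$JS.API.ACCOUNT.PURGE."
	if !strings.HasPrefix(subject, prefix) {
		return "", false
	}
	acc := strings.TrimPrefix(subject, prefix)
	if acc == "" || strings.Contains(acc, ".") {
		return "", false
	}
	return acc, true
}

func main() {
	acc, ok := accountFromPurgeSubject("$JS.API.ACCOUNT.PURGE.MYACC")
	fmt.Println(acc, ok) // MYACC true
}
```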
Signed-off-by: Matthias Hanel <mh@synadia.com>
The interest moved across the leafnode would be for the mapping, and not the actual qsub.
So when it is received, if we detect that we are mapped and do not have a queue filter present, make sure to ignore it.
This will allow queue subscriber processing on the local server that received the message from the leafnode.
Signed-off-by: Derek Collison <derek@nats.io>
If there are more stream names than the current limit of 1024, getting
the list of names would return them all instead of using pagination.
For "stream infos", the Total amount returned would be the API limit
instead of the actual number of streams.
Resolves https://github.com/nats-io/natscli/issues/541
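The intended behavior can be sketched as plain offset-based paging, where Total reflects the real number of streams rather than the API limit. The `paginate` helper below is illustrative, not the server's actual code.

```go
package main

import "fmt"

// paginate returns at most limit names starting at offset, together with
// the total number of streams (not the page size or the API limit).
func paginate(names []string, offset, limit int) (page []string, total int) {
	total = len(names) // Total is the real count
	if offset >= total {
		return nil, total
	}
	end := offset + limit
	if end > total {
		end = total
	}
	return names[offset:end], total
}

func main() {
	// 2000 streams with a 1024-per-page limit: two requests are needed,
	// and Total is 2000 on both.
	names := make([]string, 2000)
	page, total := paginate(names, 1024, 1024)
	fmt.Println(len(page), total) // 976 2000
}
```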
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The `JSStreamNameExistErr` will now include in the description that
the stream exists with a different configuration, because that is
the error clients would get when trying to add a stream with a
different configuration (otherwise this is a no-op and clients
don't get an error).
Since that error was used in the case of restore, a new error is
added that uses the same description prefix "stream name already in use"
but adds ", cannot restore" to indicate that this is a restore
failure because the stream already exists.
Resolves #3273
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The Move/Cancel/Downscale mechanism did not take into account that
the consumer's replica count can be set independently.
This also alters peer selection to be able to skip the
unique tag prefix check for the server that will be replaced.
Say you have 3 AZs and want to add another server to az:1
in order to replace a server in the same zone.
Without this change, the uniqueTagPrefix check would filter out
the replacement server and cause a failure.
The cancel move response could not be received due to
the wrong account name.
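A sketch of the relaxed unique-tag check; the `violatesUniqueTag` helper and its signature are illustrative assumptions, not the server's actual peer-selection code.

```go
package main

import "fmt"

// violatesUniqueTag rejects a candidate whose unique tag value is already
// held by an existing peer, unless that peer is the one being replaced.
// existing maps server name to its unique tag value.
func violatesUniqueTag(candidateTag string, existing map[string]string, replacing string) bool {
	for server, tag := range existing {
		if server == replacing {
			continue // the server being replaced does not count
		}
		if tag == candidateTag {
			return true
		}
	}
	return false
}

func main() {
	existing := map[string]string{"s1": "az:1", "s2": "az:2"}
	// Adding a server in az:1 to replace s1 (also in az:1) is allowed.
	fmt.Println(violatesUniqueTag("az:1", existing, "s1")) // false
	// Without a server being replaced, az:1 is already taken.
	fmt.Println(violatesUniqueTag("az:1", existing, "")) // true
}
```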
Signed-off-by: Matthias Hanel <mh@synadia.com>
* add the ability to cancel a move in progress
Move to individual subjects for move and cancel_move.
The new subjects are:
$JS.API.ACCOUNT.STREAM.MOVE.*.*
$JS.API.ACCOUNT.STREAM.CANCEL_MOVE.*.*
The last and second-to-last tokens are the account and stream names.
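A sketch of building these subjects. The commit only says the last two tokens carry the account and stream names; that the account token precedes the stream token is an assumption here, and `moveSubject` is an illustrative helper, not the server's API.

```go
package main

import "fmt"

// moveSubject builds the per-stream move or cancel_move subject, filling
// the two trailing wildcards with the account and stream names (order
// assumed: account first, then stream).
func moveSubject(cancel bool, account, stream string) string {
	verb := "MOVE"
	if cancel {
		verb = "CANCEL_MOVE"
	}
	return fmt.Sprintf("$JS.API.ACCOUNT.STREAM.%s.%s.%s", verb, account, stream)
}

func main() {
	fmt.Println(moveSubject(false, "ACC", "ORDERS"))
	fmt.Println(moveSubject(true, "ACC", "ORDERS"))
}
```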
Signed-off-by: Matthias Hanel <mh@synadia.com>
Make sure when processing a peer removal that the stream assignment agrees.
When a new leader takes over, it can resend a peer removal, and if the stream/consumer really was rescheduled we could remove it by accident.
Also make sure that when we remove a stream we remove the node as part of the stream assignment.
If we didn't, and the same asset returned to this server, we would not start up the monitoring loop.
Simplify migration logic in monitorStream, to be driven by leader only
Improved unit tests
Added failure when server not in peer list
Move command does not require server anymore
Signed-off-by: Matthias Hanel <mh@synadia.com>
Also, for direct gets and for pull requests, if we are not on a client connection, check how long we have been away from the readloop.
If need be, execute in a separate go routine.
Signed-off-by: Derek Collison <derek@nats.io>
Also added:
ability to reload tags
special tag (!jetstream) to remove a peer from peer placement
$JS.API.SERVER.STREAM.MOVE subject to initiate a move away from a server
This changes a detail about regular stream moves as well.
Before, differing cluster names were used to start/stop a transfer.
Now only the peer list and its size relative to the configured replica count matter.
Once a transfer is considered complete, excess peers will be dropped
from the beginning of the list.
This allows transfers within the cluster as well.
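The completion step can be sketched as follows; `finalizeTransfer` is an illustrative helper, not the server's actual code.

```go
package main

import "fmt"

// finalizeTransfer trims a peer list that has grown past the configured
// replica count during a move: the excess peers are dropped from the
// beginning of the list, keeping the most recently added ones.
func finalizeTransfer(peers []string, replicas int) (kept, dropped []string) {
	if len(peers) <= replicas {
		return peers, nil // transfer not (over)complete; nothing to drop
	}
	n := len(peers) - replicas
	return peers[n:], peers[:n]
}

func main() {
	// Moving one replica: D joined, so the oldest peer A is dropped.
	kept, dropped := finalizeTransfer([]string{"A", "B", "C", "D"}, 3)
	fmt.Println(kept, dropped) // [B C D] [A]
}
```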
Signed-off-by: Matthias Hanel <mh@synadia.com>