I have seen cases, maybe due to previous issue with configuration
reload that would miss subscriptions in the sublist because
of the sublist swap, where we would attempt to remove subscriptions
by batch but some were not present. I would have expected that
all present subscriptions would still be removed, even if the
call overall returned an error.
This is now fixed and a test has been added demonstrating that
even on error, we remove all subscriptions that were present.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
One should not access s.opts directly but instead use s.getOpts().
Also, server lock needs to be released when performing an account
lookup (since this may result in server lock being acquired).
A function was calling s.LookupAccount under the client lock, which
technically creates a lock inversion situation.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
A new field in `tls{}` blocks force the server to do TLS handshake
before sending the INFO protocol.
```
leafnodes {
port: 7422
tls {
cert_file: ...
...
handshake_first: true
}
remotes [
{
url: tls://host:7423
tls {
...
handshake_first: true
}
}
]
}
```
Note that if `handshake_first` is set in the "accept" side, the
first `tls{}` block in the example above, a server trying to
create a LeafNode connection to this server would need to have
`handshake_first` set to true inside the `tls{}` block of
the corresponding remote.
Configuration reload of leafnodes is generally not supported,
but TLS certificates can be reloaded and the support for this
new field was also added.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This can happen when we reset a stream internally and the stream had a prior snapshot.
Also make sure to always release resources back to the account regardless if the store is no longer present.
Signed-off-by: Derek Collison <derek@nats.io>
In that mode, a server accepts and will switch to same compression
level than the remote (if one is set) but will not initiate compression.
So if all servers in a cluster do not have compression setting set,
it defaults to "accept" which means that compression is "off".
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The new field `compression` in the `cluster{}` block allows to
specify which compression mode to use between servers.
It can be simply specified as a boolean or a string for the
simple modes, or as an object for the "s2_auto" mode where
a list of RTT thresholds can be specified.
By default, if no compression field is specified, the server
will use the s2_auto mode with default RTT thresholds of
10ms, 50ms and 100ms for the "uncompressed", "fast", "better"
and "best" modes.
```
cluster {
..
# Possible values are "disabled", "off", "enabled", "on",
# "accept", "s2_fast", "s2_better", "s2_best" or "s2_auto"
compression: s2_fast
}
```
To specify a different list of thresholds for the s2_auto,
here is how it would look like:
```
cluster {
..
compression: {
mode: s2_auto
# This means that for RTT up to 5ms (included), then
# the compression level will be "uncompressed", then
# from 5ms+ to 15ms, the mode will switch to "s2_fast",
# then from 15ms+ to 50ms, the level will switch to
# "s2_better", and anything above 50ms will result
# in the "s2_best" compression mode.
rtt_thresholds: [5ms, 15ms, 50ms]
}
}
```
Note that the "accept" mode means that a server will accept
compression from a remote and switch to that same compression
mode, but will otherwise not initiate compression. That is,
if 2 servers are configured with "accept", then compression
will actually be "off". If one of the server had say s2_fast
then they would both use this mode.
If a server has compression mode set (other than "off") but
connects to an older server, there will be no compression between
those 2 routes.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We needed to make sure at the lowest level that the state was read from disk and not depend on upper layer consumer.
Signed-off-by: Derek Collison <derek@nats.io>
This PR effectively reverts part of #4084 which removed the coalescing
from the outbound queues as I initially thought it was the source of a
race condition.
Further investigation has proven that not only was that untrue (the race
actually came from the WebSocket code, all coalescing operations happen
under the client lock) but removing the coalescing also worsens
performance.
Signed-off-by: Neil Twigg <neil@nats.io>
This is specifically when a cluster is reconfigured and the servers are
restarted with a new cluster name.
Signed-off-by: Derek Collison <derek@nats.io>
Originally I thought there was a race condition happening here,
but it turns out it is safe after all and the race condition I
was seeing was due to other problems in the WebSocket code.
Signed-off-by: Neil Twigg <neil@nats.io>
In single server mode healthz could mistake a snapshot staging
direct…ory during a restore as an account.
If the restore took a long time, stalled, or was aborted, would cause
healthz to fail.
Signed-off-by: Derek Collison <derek@nats.io>