- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If set, a server configured to accept leafnode connections will
reject a remote server whose version is below that value. Note
that servers prior to v2.8.0 do not send their version
in the CONNECT protocol, which means that any server below 2.8.0
is rejected.
Configuration example:
```
leafnodes {
port: 7422
min_version: 2.8.0
}
```
The option is a string and can have the "v" prefix:
```
min_version: "v2.9.1"
```
Note that although a suffix such as `-beta` is accepted,
only the major, minor and update numbers are used for the version comparison.
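The comparison rule described above can be sketched as follows. This is only an illustration and not the server's own parser: `parseVersion` and `atLeast` are hypothetical helpers, and treating an empty remote version as "below the minimum" is an assumption modeled on the note about pre-v2.8.0 servers.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion is a hypothetical helper. It accepts an optional "v" prefix
// and ignores any suffix such as "-beta", keeping major, minor and update.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	v = strings.TrimPrefix(v, "v")
	if i := strings.IndexByte(v, '-'); i >= 0 {
		v = v[:i] // drop the suffix, it does not take part in the comparison
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid version component %q", p)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether remote is at or above minimum. An empty remote
// version (nothing sent in CONNECT) is treated as too old (assumption).
func atLeast(remote, minimum string) bool {
	if remote == "" {
		return false
	}
	r, err := parseVersion(remote)
	if err != nil {
		return false
	}
	m, err := parseVersion(minimum)
	if err != nil {
		return false
	}
	for i := 0; i < 3; i++ {
		if r[i] != m[i] {
			return r[i] > m[i]
		}
	}
	return true
}

func main() {
	fmt.Println(atLeast("2.9.1-beta", "v2.9.1")) // true: suffix is ignored
	fmt.Println(atLeast("2.7.4", "2.8.0"))       // false
	fmt.Println(atLeast("", "2.8.0"))            // false: no version sent
}
```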
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Data race that has been seen:
```
Read at 0x00c00134bec0 by goroutine 159:
github.com/nats-io/nats-server/v2/server.(*client).msgHeaderForRouteOrLeaf()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:2935 +0x254
github.com/nats-io/nats-server/v2/server.(*client).processMsgResults()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:4364 +0x2147
(...)
Previous write at 0x00c00134bec0 by goroutine 201:
github.com/nats-io/nats-server/v2/server.(*Server).addRoute()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:1475 +0xdb4
github.com/nats-io/nats-server/v2/server.(*client).processRouteInfo()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:641 +0x1704
```
Also fixed some flappers and removed use of `s.js.` since we have
already captured `js` in Jsz monitoring.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Kubernetes probes neither use nor log the response body of health
endpoints. This means that if a nats node running in Kubernetes
enters a Not Ready state, we have no way to know why
other than to manually access the cluster, call the /healthz endpoint
and look at the error.
This change adds an error log so we can observe what is going wrong with
a nats node that is not ready.
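A minimal sketch of the idea, assuming a handler that takes a check function: when the check fails, the error is written to the log in addition to the response body. The names `healthHandler`, `probe` and `errNotCurrent`, the log wording, and the 503 status with an `error` field are all assumptions for illustration, not the server's actual code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// errNotCurrent is a made-up failure used only for this example.
var errNotCurrent = errors.New("JetStream is not current with the meta leader")

// healthHandler is a hypothetical handler: on failure it logs the error so
// the reason is visible even when the probe discards the response body.
func healthHandler(check func() error, logger *log.Logger) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		if err := check(); err != nil {
			logger.Printf("healthcheck failed: %v", err)
			w.WriteHeader(http.StatusServiceUnavailable)
			json.NewEncoder(w).Encode(map[string]string{
				"status": "unavailable",
				"error":  err.Error(),
			})
			return
		}
		json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
	}
}

// probe calls the handler once in-process and returns the HTTP status code
// together with whatever was written to the log.
func probe(check func() error) (int, string) {
	var buf bytes.Buffer
	logger := log.New(&buf, "", 0)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
	healthHandler(check, logger)(rec, req)
	return rec.Code, buf.String()
}

func main() {
	code, logged := probe(func() error { return errNotCurrent })
	fmt.Println(code)
	fmt.Print(logged)
}
```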
Signed-off-by: Samuel Torres <samuel.torres@form3.tech>
The data from other nodes is usually wrong, which can be quite
confusing for users, so we now only send it when we are the leader.
Signed-off-by: R.I.Pienaar <rip@devco.net>
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress", which is the number of messages
that have been pop()'ed. When recycle() is invoked, this count
goes down.
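The "in progress" accounting can be illustrated with a simplified queue. This is a sketch under assumptions, not the server's queue: the `msgQueue` type, its string elements and the method signatures are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// msgQueue is a hypothetical, simplified queue that tracks how many
// messages have been handed out by pop() and not yet recycled.
type msgQueue struct {
	mu         sync.Mutex
	pending    []string
	inProgress int
}

// push appends a message to the pending list.
func (q *msgQueue) push(m string) {
	q.mu.Lock()
	q.pending = append(q.pending, m)
	q.mu.Unlock()
}

// pop hands all pending messages to the caller and counts them as in progress.
func (q *msgQueue) pop() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	batch := q.pending
	q.pending = nil
	q.inProgress += len(batch)
	return batch
}

// recycle signals that the caller is done with a batch obtained from pop().
func (q *msgQueue) recycle(batch []string) {
	q.mu.Lock()
	q.inProgress -= len(batch)
	q.mu.Unlock()
}

// inProgressCount returns the number of popped but not yet recycled messages.
func (q *msgQueue) inProgressCount() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.inProgress
}

func main() {
	q := &msgQueue{}
	q.push("a")
	q.push("b")
	batch := q.pop()
	fmt.Println(q.inProgressCount()) // 2
	q.recycle(batch)
	fmt.Println(q.inProgressCount()) // 0
}
```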
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Gateway connection will be closed and error reported if a remote
has a name that is a duplicate of the local cluster.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
User and activation tokens did not honor the jwt value for all (`*`) on
connect.
Activation tokens were not re-evaluated when the export revoked a key.
In part this is a consistency measure, so that servers that already have an
account and servers that don't behave the same way.
In the jwt, activation token revocations are stored per export.
The server stored them per account, thus effectively merging
revocations. Now they are stored per export inside the server too.
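To illustrate why merging matters, here is a small sketch of per-export revocations. The types and the rule that a key is revoked when the token was issued at or before the recorded timestamp are assumptions for the example, not the server's data structures.

```go
package main

import "fmt"

// revocations maps a revoked public key (or "*" for all keys) to a Unix
// timestamp. In this sketch a token is considered revoked when it was
// issued at or before that timestamp (assumption).
type revocations map[string]int64

// exportRevocations keeps one revocation list per export subject instead
// of a single merged list for the whole account.
type exportRevocations map[string]revocations

// revoke records a revocation for key on the given export only.
func (e exportRevocations) revoke(export, key string, at int64) {
	if e[export] == nil {
		e[export] = revocations{}
	}
	e[export][key] = at
}

// isRevoked checks the specific key first, then the "*" entry.
func (e exportRevocations) isRevoked(export, key string, issuedAt int64) bool {
	revs := e[export]
	if ts, ok := revs[key]; ok && issuedAt <= ts {
		return true
	}
	if ts, ok := revs["*"]; ok && issuedAt <= ts {
		return true
	}
	return false
}

func main() {
	er := exportRevocations{}
	er.revoke("orders.>", "ACCOUNT_KEY", 100)

	fmt.Println(er.isRevoked("orders.>", "ACCOUNT_KEY", 90))   // true
	fmt.Println(er.isRevoked("invoices.>", "ACCOUNT_KEY", 90)) // false: other export
	fmt.Println(er.isRevoked("orders.>", "ACCOUNT_KEY", 150))  // false: issued later
}
```

With a single merged list per account, the second lookup would also report the key as revoked, which is the merging effect described above.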
fixes nats-io/nsc/issues/442
Signed-off-by: Matthias Hanel <mh@synadia.com>
Currently this code returns a 200 and { "status": "ok" } if and only if all configured ports are open
and, when JetStream is configured, we have contact with the metaleader and the cluster, and all streams are up to date.
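From a client's point of view, the contract above could be checked roughly like this. It is a sketch: `checkResponse` and `isHealthy` are hypothetical helpers, and the non-OK body used in `main` is made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// healthStatus mirrors the { "status": "ok" } body described above.
type healthStatus struct {
	Status string `json:"status"`
}

// checkResponse reports whether a response counts as healthy: HTTP 200
// and a JSON body whose "status" field is "ok".
func checkResponse(code int, body []byte) bool {
	if code != http.StatusOK {
		return false
	}
	var hs healthStatus
	if err := json.Unmarshal(body, &hs); err != nil {
		return false
	}
	return hs.Status == "ok"
}

// isHealthy performs the GET against a monitoring URL supplied by the
// caller and applies checkResponse to the result.
func isHealthy(url string) (bool, error) {
	resp, err := http.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return checkResponse(resp.StatusCode, body), nil
}

func main() {
	fmt.Println(checkResponse(200, []byte(`{ "status": "ok" }`)))      // true
	fmt.Println(checkResponse(503, []byte(`{"status":"unavailable"}`))) // false
}
```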
Signed-off-by: Derek Collison <derek@nats.io>
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.
We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves.
Signed-off-by: Derek Collison <derek@nats.io>
It appears that getPublicConsumers() is called, which
produces a list of consumers, and that between the time
the list is made and the time Info() is called, the
ephemeral consumer was removed.
Signed-off-by: R.I.Pienaar <rip@devco.net>
- When detecting duplicate route, it was possible that a server
would lose track of the peer's gateway URL, which would prevent
  it from gossiping that URL to inbound gateway connections.
- When a server has gateways enabled and has as a remote its
own gateway, the monitoring endpoint `/varz` would include it
but without the "urls" array.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
ClientID has been added to various monitoring objects. Also, added
the ability to filter connections on `client_id`.
On auth violation, the proper code was not invoked, which meant
that no disconnect event (with auth reason) would be published.
Resolves #2270
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The locking order is jetStream->Server, not the other way around. There
were a few places where lock inversion could have caused a deadlock.
Also, a change made recently to solve a deadlock was causing
a race that is demonstrated with TestJetStreamRaceOnRAFTCreate.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added in client kind and sub type for clients.
Added in ability to filter connections based on matching subject interest.
Signed-off-by: Derek Collison <derek@nats.io>