When a JS API request is received from a non-client connection, it
is processed in its own goroutine. To reduce the number of such
goroutines, we were limiting the number of outstanding routines
to 4096. However, in some situations it was possible to issue so
many requests at once that some of them would be dropped.
(An example was an MQTT benchmark tool that created 5000 sessions,
each with one QoS1 R1 consumer (with the use of consumer_replicas=1).
On abrupt exit of the tool, the consumers and their sessions needed
to be deleted, which caused a fast burst of incoming delete-consumer
requests and made the original code drop some of them.)
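For illustration, here is a simplified sketch of the previous behavior (hypothetical names, not the actual server code): a buffered channel caps the number of outstanding goroutines, and requests are dropped once the cap is reached.

```go
// Simplified sketch only; not the server's implementation.
package server

import "log"

const maxOutstanding = 4096

// slots acts as a counter of outstanding request goroutines.
var slots = make(chan struct{}, maxOutstanding)

// processJSAPIRequest runs the request in its own goroutine unless the cap
// of outstanding goroutines has been reached, in which case it is dropped.
func processJSAPIRequest(req func()) {
	select {
	case slots <- struct{}{}:
		go func() {
			defer func() { <-slots }()
			req()
		}()
	default:
		// This drop is what caused bursts (e.g. many consumer deletes at
		// once) to lose requests.
		log.Println("dropping JS API request: too many outstanding requests")
	}
}
```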
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Step down timing for consumers and streams: signal the loss of
leadership and sleep before stepping down. This makes it less likely
that messages are still being processed during step down.
When becoming leader, the consumer's stream sequence number got reset,
even though the consumer already existed.
Proper cleanup of redelivery data structures and timers.
Signed-off-by: Matthias Hanel <mh@synadia.com>
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- A deadlock with Jsz monitoring.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Basically, a gateway subject propagation issue could be hidden behind a leaf
node.
Also changed the error text for this case.
Signed-off-by: Matthias Hanel <mh@synadia.com>
Some warnings, especially those about JS limits that were printed
on a per-message basis, are now limited to ~1 per second if the
content of the warning is already found in a map.
This also applies to "client" warnings, but the client portion of the
warning is not taken into account, which helps reduce logging
for similar content coming from different clients.
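As a rough sketch of the idea (hypothetical names, not the actual server code), the warning text is used as a map key and a timestamp decides whether it may be emitted again:

```go
// Sketch of rate-limited warnings: a warning is only emitted if the same
// text has not been logged within the last second.
package server

import (
	"log"
	"sync"
	"time"
)

type rateLimitedLogger struct {
	mu   sync.Mutex
	seen map[string]time.Time // warning text -> last time it was logged
}

func newRateLimitedLogger() *rateLimitedLogger {
	return &rateLimitedLogger{seen: make(map[string]time.Time)}
}

// Warnf logs the warning unless identical text was logged less than a second ago.
func (l *rateLimitedLogger) Warnf(msg string) {
	l.mu.Lock()
	last, ok := l.seen[msg]
	now := time.Now()
	if ok && now.Sub(last) < time.Second {
		l.mu.Unlock()
		return
	}
	l.seen[msg] = now
	l.mu.Unlock()
	log.Printf("WRN %s", msg)
}
```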
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced in v2.6.6. In order to solve a config reload
issue, we used tls.Config.GetConfigForClient, which allowed the
TLS configuration to be "refreshed" with the latest version. However, in
this case tls.Config.ClientAuth was not reset to tls.NoClientCert,
which is what we need for the monitoring port.
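A hedged sketch of the shape of such a fix (not the server's actual code; the latest() helper is an assumption standing in for however the refreshed config is obtained):

```go
// When the monitoring HTTPS listener refreshes its TLS config via
// GetConfigForClient, the returned config must not request client certs.
package server

import "crypto/tls"

// monitoringTLSConfig wraps the latest server TLS config so that the
// monitoring port never requires client certificate authentication.
func monitoringTLSConfig(latest func() *tls.Config) *tls.Config {
	return &tls.Config{
		GetConfigForClient: func(_ *tls.ClientHelloInfo) (*tls.Config, error) {
			cfg := latest().Clone()
			// The monitoring port should not ask for a client certificate,
			// regardless of what client connections require.
			cfg.ClientAuth = tls.NoClientCert
			return cfg, nil
		},
	}
}
```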
Resolves #2980
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removed the warnings; instead we have a sync.Map where they are
registered/unregistered and can be inspected via an undocumented
monitoring page.
Added the notion of "in progress", which is the number of messages
that have been pop()'ed. When recycle() is invoked, this count
goes down.
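A rough sketch of this bookkeeping (simplified, not the actual implementation): entries live in a sync.Map, and an "in progress" counter tracks pop()'ed but not yet recycle()'d messages.

```go
package server

import (
	"sync"
	"sync/atomic"
)

type ipQueue struct {
	inProgress int64 // messages pop()'ed but not yet recycled
	mu         sync.Mutex
	msgs       []any
}

// Registry of queues, inspectable from a monitoring page.
var queues sync.Map // name -> *ipQueue

func registerQueue(name string, q *ipQueue) { queues.Store(name, q) }
func unregisterQueue(name string)           { queues.Delete(name) }

// pop drains the pending messages and counts them as "in progress".
func (q *ipQueue) pop() []any {
	q.mu.Lock()
	defer q.mu.Unlock()
	msgs := q.msgs
	q.msgs = nil
	atomic.AddInt64(&q.inProgress, int64(len(msgs)))
	return msgs
}

// recycle is called once the popped slice has been processed.
func (q *ipQueue) recycle(msgs []any) {
	atomic.AddInt64(&q.inProgress, -int64(len(msgs)))
}
```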
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The gateway connection will be closed and an error reported if a remote
has a name that duplicates the local cluster name.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is due to an unaligned 64-bit atomic operation. Move the
field to the top of the structure so that it is preceded only by
64-bit aligned fields.
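For context, sync/atomic only guarantees 64-bit alignment for the first word of an allocated struct on 32-bit platforms, so a fix of this kind has the following general shape (illustrative struct, not the actual one):

```go
package server

import "sync/atomic"

type stats struct {
	// Accessed with atomic.AddUint64/LoadUint64, so it comes first to be
	// 64-bit aligned even on 32-bit architectures.
	bytes uint64

	// Smaller fields follow, where alignment no longer matters for atomics.
	name   string
	closed bool
}

func (s *stats) addBytes(n uint64) { atomic.AddUint64(&s.bytes, n) }
func (s *stats) total() uint64     { return atomic.LoadUint64(&s.bytes) }
```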
Resolves #2011
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced when fixing #2881. The call to setFirstPingTimer
needed to be done under the client's lock.
Moved setFirstPingTimer from a server receiver to a client receiver.
The only reason it was a server receiver was that it needs the
server options, but c.srv is always set when this function is invoked,
so we now get the server from c.srv in that function.
Related to #2881
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Currently this code returns a 200 and { "status": "ok" } iff all configured ports are open
and, if JetStream is configured, we have contact with the meta leader, and the cluster and all streams are up to date.
Signed-off-by: Derek Collison <derek@nats.io>
We will only send if all peers in our group are >= 2.7.1, and we will check for updates.
When a consumer follower takes over, it will notify all pending requests that they are now invalid.
Signed-off-by: Derek Collison <derek@nats.io>
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.
We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves.
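For illustration only (using the nats.go client; the URL and names are placeholders, and whether MaxBytes is required depends on the account's configuration), a stream created under such an account needs an explicit MaxBytes:

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect("nats://127.0.0.1:4222")
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// MaxBytes must be set when the account requires it.
	if _, err := js.AddStream(&nats.StreamConfig{
		Name:     "ORDERS",
		Subjects: []string{"orders.>"},
		MaxBytes: 1 << 30, // 1 GiB cap for this stream
	}); err != nil {
		log.Fatal(err)
	}
}
```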
Signed-off-by: Derek Collison <derek@nats.io>
Along a leaf node connection, unless the system account is shared AND the JetStream domain name is identical, default JetStream traffic (without a domain set) will be denied.
As a consequence, all clients that want to access a domain other than the one of the server they are connected to must specify a domain name.
Affected by this change are setups where a leaf node had no local JetStream OR the server the leaf node connected to had no local JetStream;
that is, one of the two accounts connected via a leaf node remote has no JetStream enabled.
The side that does not have JetStream enabled will lose JetStream access, and its clients must set `nats.Domain` manually.
For workarounds on how to restore the old behavior, look at:
https://github.com/nats-io/nats-server/pull/2693#issuecomment-996212582
New config values added:
`default_js_domain` is a mapping from account to domain, settable when JetStream is not enabled in an account.
`extension_hint` is a hint for a non-clustered server to start in clustered mode (so it can be used to extend a cluster).
`js_domain` is a way to set the JetStream domain to use for MQTT.
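As a hedged sketch of how these values might appear in a server config (the account name ACC and domain name hub are placeholders; adapt to your deployment):

```
jetstream {
  domain: hub
  # Hint so a non-clustered server starts in clustered mode and can be extended.
  extension_hint: will_extend
}

# Map an account (without JetStream enabled) to a default JetStream domain.
default_js_domain {
  ACC: hub
}

mqtt {
  port: 1883
  # JetStream domain used for MQTT.
  js_domain: hub
}
```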
Signed-off-by: Matthias Hanel <mh@synadia.com>
When creating the HTTP server, we need to provide a TLS configuration.
After a config reload, the new TLS config would not be reflected.
We had the same issue with Websocket, which was fixed with the use
of the tls.Config.GetConfigForClient API, which makes the TLS handshake
ask for a TLS config. That fix for Websocket was simply not applied
to the HTTPS monitoring case.
I have also fixed some flappers due to the use of localhost instead
of 127.0.0.1 (connections could resolve to an IPv6 address
that the server would not accept, etc.).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added client kind and sub type for clients.
Added the ability to filter connections based on matching subject interest.
Signed-off-by: Derek Collison <derek@nats.io>
This is related to PR #2407. Since the 64MB pending size is actually
configurable, we should fail only if max_payload is greater than
the configured max_pending. This is done in validateOptions(), which
covers both the config file and direct options in embedded cases.
The check in opts.go is reverted to max int32 since at this point
we don't know if/what max_pending will be, so we simply check
that it is not more than an int32.
For the next minor release, we could have another change that
imposes a lower limit on max_payload (regardless of whether max_pending
is higher).
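A hedged sketch of the validation described above (simplified; the field and function names mirror the idea, not the exact server code):

```go
package server

import (
	"fmt"
	"math"
)

type Options struct {
	MaxPayload int32 // maximum size of a single message payload
	MaxPending int64 // maximum pending/outbound bytes per connection
}

// validateOptions rejects a max_payload larger than the configured max_pending.
func validateOptions(o *Options) error {
	if int64(o.MaxPayload) > o.MaxPending {
		return fmt.Errorf("max_payload (%d) cannot be higher than max_pending (%d)",
			o.MaxPayload, o.MaxPending)
	}
	return nil
}

// At option-parsing time only the int32 bound can be checked, since
// max_pending may not be known yet.
func checkPayloadBound(v int64) error {
	if v > math.MaxInt32 {
		return fmt.Errorf("max_payload (%d) exceeds int32 limit", v)
	}
	return nil
}
```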
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is the reverse of the earlier work to have LNs extend a non-JS cluster.
Also adds mixed mode tests.
Signed-off-by: Derek Collison <derek@nats.io>
* [fixed] jetstream unique server name requirement across domains
including the domain in server info
adding a check for the cluster name in the duplicate leaf node connection check
This does not address non-unique server names within the same domain, say within a
super cluster.
Signed-off-by: Matthias Hanel <mh@synadia.com>