Add basic peer certificates information in /connz endpoint when
the "auth" option is provided.
Resolves#3317
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We are phasing out the optimistic-only mode. Servers accepting
inbound gateway connections will switch the accounts to interest-only
mode.
The servers with outbound gateway connection will check interest
and ignore the "optimistic" mode if it is known that the corresponding
inbound is going to switch the account to interest-only. This is
done using a boolean in the gateway INFO protocol.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- didRemove in applyMetaEntries() could be reset when processing
multiple entries
- change "no race" test names to include JetStream
- separate raft nodes leader stepdown and stop in server
shutdown process
- in InstallSnapshot, call wal.Compact() with lastIndex+1
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Maybe that is the place it could be set and not in NewServer(), but
want to minimize risk of breaking something close to 2.9.0
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
A test TestJetStreamClusterLeafNodeSPOFMigrateLeaders was added at
some point that needed the remotes to stop (re)connecting. It made
use of existing leafNodeEnabled that was used for GW/Leaf interest
propagation races to disable the reconnect, but that may not be
the best approach since it could affect users embedding servers
and adding leafnodes "dynamically".
So this PR introduced a specific boolean specific for that test.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added http monitoring endpoint /accstatz
It responds with a list of statz for all accounts with local connections
the argument "unused=1" can be provided to get statz for all accounts
This endpoint is also exposed as nats request under:
This monitoring endpoint is exposed via the system account.
$SYS.REQ.ACCOUNT.*.STATZ
Each server will respond with connection statistics for the requested
account. The format of the data section is a list (size 1) identical to the event
$SYS.ACCOUNT.%s.SERVER.CONNS which is sent periodically as well as on
connect/disconnect. Unless requested by options, server without the account,
or server where the account has no local connections, will not respond.
A PING endpoint exists as well. The response format is identical to
$SYS.REQ.ACCOUNT.*.STATZ
(however the data section will contain more than one account, if they exist)
In addition to general filter options the request takes a list of accounts and
an argument to include accounts without local connections (disabled by default)
$SYS.REQ.ACCOUNT.PING.STATZ
Each account has a new system account import where the local subject
$SYS.REQ.ACCOUNT.PING.STATZ essentially responds as if
the importing account name was used for $SYS.REQ.ACCOUNT.*.STATZ
The only difference between requesting ACCOUNT.PING.STATZ from within
the system account and an account is that the later can only retrieve
statz for the account the client requests from.
Also exposed the monitoring /healthz via the system account under
$SYS.REQ.SERVER.*.HEALTHZ
$SYS.REQ.SERVER.PING.HEALTHZ
No dedicated options are available for these.
HEALTHZ also accept general filter options.
Signed-off-by: Matthias Hanel <mh@synadia.com>
The "InProcess" change make readyForConnections() possibly return
much faster than it used to, which could cause tests to fail.
Restore the original behavior, but in case of DontListen option
wait on the startupComplete gate.
Also fixed some missing checks for leafnode connections.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If a JS API request is received from a non client connection, it
was processed in its own go routine. To reduce the number of
such go routine, we were limiting the number of outstanding routines
to 4096. However, in some situations, it was possible to issue
many requests at the same time that would then cause those requests
to be dropped.
(an example was an MQTT benchmark tool that would create 5000
sessions, each with one QoS1 R1 consumer (with the use of consumer_replicas=1).
On abrupt exit of the tool, the consumers and their sessions needed
to be deleted. Since would cause fast incoming delete consumer requests
which would cause the original code to drop some of them)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Step down timing for consumers or streams.
Signals loss of leadership and sleeps before stepping down.
This makes it less likely that messages are being processed during step
down.
When becoming leader, consumer stream seqno got reset,
even though the consumer existed already.
Proper cleanup of redelivery data structures and timer
Signed-off-by: Matthias Hanel <mh@synadia.com>
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
basically a gw subject propagation issue could be hidden behind a leaf
node.
also change error text when this was the case
Signed-off-by: Matthias Hanel <mh@synadia.com>
Some warnings, especially when dealing with JS limits that were
printed on a per-message basis, are now limited to ~1 per second
if the content of the warning is already found in a map.
This is also for "client" warnings, but the client porting of the
warning is not taken into account so that helps with reducing logging
for similar content, but coming from different clients.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced in v2.6.6. In order to solve a config reload
issue, we used tls.Config.GetConfigForClient which allowed the
TLS configuration to be "refreshed" with the latest. However, in
this case, the tls.Config.ClientAuth was not reset to tls.NoClientCert
which we need for monitoring port.
Resolves#2980
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have beend pop()'ed. When recycle() is invoked this count
goes down.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Gateway connection will be closed and error reported if a remote
has a name that is a duplicate of the local cluster.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is due to an unaligned 64-bit atomic operation. Move the
field at top of structure with 64-bit aligned preceding fields.
Resolves#2011
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced when fixing #2881. The call to setFirstPingTimer
needed to be done under the client's lock.
Moved setFirstPingTimer from a server receiver to a client receiver.
The only reason it was a server receiver is because we need the
server options, but c.srv is always set when invoking this function,
so we will get the server from c.srv in that function now.
Related to #2881
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Currently this code returns a 200 and { "status": "ok" } iff all configured ports are open
and if JetStream is configured and we have contact with the metaleader and the cluster and all streams are up to date.
Signed-off-by: Derek Collison <derek@nats.io>
We will only send if all peers in our group are >= 2.7.1 and we will check for updates.
When a consumer follower takes over it will notify all pending requests that those requests are invalid now.
Signed-off-by: Derek Collison <derek@nats.io>