Commit Graph

166 Commits

Author SHA1 Message Date
Ivan Kozlovic
f6c4e5fcee [CHANGED] Gateway: Switch all accounts to interest-only mode
We are phasing out the optimistic-only mode. Servers accepting
inbound gateway connections will switch the accounts to interest-only
mode.

The servers with outbound gateway connection will check interest
and ignore the "optimistic" mode if it is known that the corresponding
inbound is going to switch the account to interest-only. This is
done using a boolean in the gateway INFO protocol.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-19 16:41:44 -06:00
Ivan Kozlovic
3c9a7cc6e5 Move to Go 1.19, remote io/util, fix data race and a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 09:55:37 -06:00
Matthias Hanel
d53d2d0484 [Added] account specific monitoring endpoint(s) (#3250)
Added http monitoring endpoint /accstatz
It responds with a list of statz for all accounts with local connections
the argument "unused=1" can be provided to get statz for all accounts
This endpoint is also exposed as nats request under:

This monitoring endpoint is exposed via the system account.
$SYS.REQ.ACCOUNT.*.STATZ
Each server will respond with connection statistics for the requested
account. The format of the data section is a list (size 1) identical to the event
$SYS.ACCOUNT.%s.SERVER.CONNS which is sent periodically as well as on
connect/disconnect. Unless requested by options, server without the account,
or server where the account has no local connections, will not respond.

A PING endpoint exists as well. The response format is identical to
$SYS.REQ.ACCOUNT.*.STATZ
(however the data section will contain more than one account, if they exist)
In addition to general filter options the request takes a list of accounts and
an argument to include accounts without local connections (disabled by default)
$SYS.REQ.ACCOUNT.PING.STATZ

Each account has a new system account import where the local subject
$SYS.REQ.ACCOUNT.PING.STATZ essentially responds as if
the importing account name was used for $SYS.REQ.ACCOUNT.*.STATZ

The only difference between requesting ACCOUNT.PING.STATZ from within
the system account and an account is that the later can only retrieve
statz for the account the client requests from.

Also exposed the monitoring /healthz via the system account under
$SYS.REQ.SERVER.*.HEALTHZ
$SYS.REQ.SERVER.PING.HEALTHZ
No dedicated options are available for these.
HEALTHZ also accept general filter options.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-07-12 21:50:32 +02:00
Derek Collison
9b7c81c37e Some tests improvements on non-standard JS cluster setup
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-03 12:45:27 -07:00
Ivan Kozlovic
5261d98781 [ADDED] Monitoring: Routez's individual route has now more info
Added Start, LastActivity, Uptime and Idle that we normally have
in a Connz for non route connections. This info can be useful
to determine if a route is recent, etc..

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-18 13:18:53 -06:00
Ivan Kozlovic
14f54b8dd7 [ADDED] Monitoring: MQTT and Websocket blocks in /varz endpoint
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-04 10:11:55 -06:00
R.I.Pienaar
4c4aa3e87f skips jsz on non js machines when leader only requested
This is a regression introduced in 055703f4fa
that leads to panics in management tooling

Signed-off-by: R.I.Pienaar <rip@devco.net>
2022-03-31 12:02:09 +02:00
Ivan Kozlovic
7bb7309f4c [FIXED] Monitoring: verify_and_map in tls{} config would break monitoring
This was introduced in v2.6.6. In order to solve a config reload
issue, we used tls.Config.GetConfigForClient which allowed the
TLS configuration to be "refreshed" with the latest. However, in
this case, the tls.Config.ClientAuth was not reset to tls.NoClientCert
which we need for monitoring port.

Resolves #2980

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-30 18:50:52 -06:00
R.I.Pienaar
055703f4fa ensures the cluster info in jsz is sent from the leader only
The data from other nodes are usually wrong, this can be quite
confusing for users so we now only send it when we are the leader

Signed-off-by: R.I.Pienaar <rip@devco.net>
2022-03-25 18:27:35 +01:00
Ivan Kozlovic
b4128693ed Ensure file path is correct during stream restore
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.

Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:31:51 -07:00
Ivan Kozlovic
0fae8067ae [FIXED] Some lock inversions
The established ordering is client -> Account, so fixed few places
where we had Account -> client.

Added a new file, locksordering.txt with the list of known ordering
for some of the objects.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 09:47:37 -07:00
Matt Stephenson
59cc0f0015 Add source and mirror info to stream monitoring 2022-01-21 12:44:42 -08:00
Derek Collison
52da55c8c6 Implement overflow placement for JetStream streams.
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.

We now track account limits and server limits more distinctly, and do not reserver server resources based on account limits themselves.

Signed-off-by: Derek Collison <derek@nats.io>
2022-01-06 19:33:08 -08:00
Ivan Kozlovic
40c0f03153 [FIXED] Monitoring: tls configuration not updated on reload
When creating the http server, we need to provide a TLS configuration.
After a config reload, the new TLS config would not be reflected.

We had the same issue with Websocket and was fixed with the use
of tls.Config.GetConfigForClient API, which makes the TLS handshake
to ask for a TLS config. That fix for websocket was simply not applied
to the HTTPs monitoring case.

I have also fixed some flappers due to the use of localhost instead
of 127.0.0.1 (connections possibly would resolve to some IPv6 address
that the server would not accept, etc..)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-11-30 10:18:46 -07:00
Ivan Kozlovic
5fc9e0e1cc [FIXED] Gateway URLs gossip and /varz report issues
- When detecting duplicate route, it was possible that a server
would lose track of the peer's gateway URL, which would prevent
it from gossiping that URL to inbound gateway connections
- When a server has gateways enabled and has as a remote its
own gateway, the monitoring endpoint `/varz` would include it
but without the "urls" array.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-10-28 12:05:30 -06:00
Ivan Kozlovic
0bd38bd424 [FIXED] Monitoring: /varz gateway URLs not always updated
When servers leave a cluster and their gateway URLs was not in
the remote cluster's configuration, it is possible that their
gateway URL do not disappear from the list of URLs in the `/varz`
monitoring endpoint.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-10-26 13:11:06 -06:00
Matthias Hanel
29a6367889 incorporating review comments.
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-09-21 17:13:53 -04:00
Matthias Hanel
9911b37b0c [added] value to JS stats showing memory used from accounts with reservations
[fixed] reservations accounting issue on reload introduced by:
commit: bfb726e8e9
clearResources appeared to have been a workaround and broke
reload for non global accounts

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-09-21 16:35:24 -04:00
Matthias Hanel
5b9d20871d Exposing reserved memory in jsz/varz
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-09-16 18:55:50 -04:00
Ivan Kozlovic
a025ce7472 Set defaultServerOptions port to -1 for random
Updated some tests based on this change but also missing defer
connection close or server shutdown.

Fixed how the OCSP run go routine would shutdown, which would
never complete because grWG was not decremented by this go routine
prior to invoking s.Shutdown()

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-09-02 14:22:56 -06:00
Derek Collison
944dd248c4 Fix for tests
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-14 17:39:51 -07:00
Derek Collison
b3f9166b4f [FIXED] Getting varz from the http endoint saw Subscriptions always double for each fetch.
Resolves part of #2170

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-03 18:43:07 -07:00
Matthias Hanel
4430a55eed [added] leaf deny exports/imports to varz monitoring (#2159)
* [added] leaf deny exports/imports to varz monitoring

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-26 16:34:09 -04:00
Jaime Piña
27e9628c3a Run gofmt -s to simplify code 2021-04-09 15:18:06 -07:00
Jaime Piña
d929ee1348 Check errors when removing test directories and files
Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".

This change checks for more filesystem related errors and calls t.Fatal
if there is an error.
2021-04-07 11:09:47 -07:00
Derek Collison
43b9017b74 Merge pull request #1953 from nats-io/api
JetStream API Changes
2021-03-02 19:46:00 -07:00
Matthias Hanel
c50ee2a1c6 [Changed] all times exposed will be computed in UTC (#1943)
This also applies to times that end up in that json.
Where applicable moved time.Now() to where it is used.
Moved calls to .UTC() to where time is created it that time is converted
later anyway.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-03-02 21:37:42 -05:00
Derek Collison
4f7fbefc7c In clustered JetStream we need to move API calls out of routes/gateways/leafnodes path.
This moves from explicit imports and subscriptions to one wildcard subscription and a single wildcard export.

Signed-off-by: Derek Collison <derek@nats.io>
2021-03-02 17:54:41 -08:00
Matthias Hanel
10154c5388 [added] system_account to varz/accounts and is_system to accountz (#1898)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-02-08 15:58:53 -05:00
Derek Collison
c11a733502 Broken test for non MarshalIndent
Signed-off-by: Derek Collison <derek@nats.io>
2021-02-07 05:08:22 -08:00
Matthias Hanel
7b7543d298 [added] jsz nats and http monitoring endpoint for jetstream (#1881)
The new endpoints are /jsz on http and "$SYS.REQ.SERVER.PING.JSZ" and "$SYS.REQ.SERVER.%s.JSZ".
$SYS.REQ.ACCOUNT.%s.JSZ will only return info for the particular account

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-02-05 18:46:04 -05:00
Matthias Hanel
f487429d9e incorporated comments
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-01-29 13:25:02 -05:00
Matthias Hanel
2a34f0daee [added] field to varz output containing the operator jwt/claim
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-01-29 12:32:40 -05:00
Matthias Hanel
c9e0eb6c3a [added] cluster/gateway/leafnode tls required/verify/timeout to varz (#1854)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-01-28 14:08:58 -05:00
Matthias Hanel
9081646109 [added] support for tags and filter ping monitoring requests by tags (#1832)
fixes #1588

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-01-21 21:16:09 -05:00
Ivan Kozlovic
f50c655e75 [FIXED] Monitoring endpoint connz?auth=true show incorrect user
Only the user (from username/password connection method) was reported
in this monitoring endpoint. Will now report proper nkey, public key,
etc..

Resolves #1799

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-11 12:59:05 -07:00
Ivan Kozlovic
df9d5f5fd9 Accepting route warns if remote server has same name
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-10-08 17:59:33 -06:00
Matthias Hanel
4ff7b280f4 Avoid unnecessary CONNS subscription
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-10-05 18:25:51 -04:00
Matthias Hanel
d501a811b8 [Added] filtering by account to leafz and exposing this as per acc subj
On the monitoring endpoint /leafz specify ?acc=<account id>

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-09-24 17:23:36 -04:00
Matthias Hanel
634ce9f7c8 [Adding] Accountz monitoring endpoint and INFO monitoring req subject
Returned imports/exports are formated like jwt exports imports, even if
they originating account is from config.

Fixes #1604

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-09-23 16:22:48 -04:00
Matthias Hanel
d086a39b64 Add filtering by name and cluster to PING events
On cluster name change, reset internalSendLoop so it picks up the
changed name.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-06-16 18:26:35 -04:00
Derek Collison
146d8f5dcb Updates based on feedback, sped up some slow tests
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-12 17:26:43 -07:00
Derek Collison
c1ffd48638 Add account details to subsz.
Also allow ability to filter based on account.

Signed-off-by: Derek Collison <derek@nats.io>
2020-06-05 05:53:01 -07:00
aricart
e7590f3065 jwt2 testbed 2020-06-01 18:00:13 -04:00
Derek Collison
05e38ae527 Merge branch 'master' into sys-acc 2020-06-01 11:53:14 -07:00
Derek Collison
2bd7553c71 System Account on by default.
Most of the changes are to turn it off for tests that were watching subscriptions and such.

Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 17:56:45 -07:00
Ivan Kozlovic
44e78a1fb6 Fixed some tests
- A race test may have consumed a lot of fds going in TIME_WAIT
that could cause some issues for other tests
- Missing defer filestore.Stop() that would leave flushLoop()
routines
- A defer for the from server in a LeafNode test
- Rework [Re]ConnectErrorReports that was failing often for me
locally (probably due to exhaustion of fds - too many TIME_WAIT).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-29 17:47:08 -06:00
Guilherme Santos
25858cba0b Implement basePath for monitoring endpoints 2020-05-13 23:29:11 +02:00
Matthias Hanel
136feb9bc6 [FIXEd] subsz monitoring endpoint did not account for accounts.
Fixes  #1371 and #1357 by adding up stats and collecting subscriptions
from all accounts.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-05-06 15:48:51 -04:00
Derek Collison
699502de8f Detection for loops with leafnodes.
We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles.
This will have the server who completes the loop be the one that detects the error soley based on
its own loop detection subject.

Otehr changes are just to fix tests that were not waiting for the new LDS sub.

Signed-off-by: Derek Collison <derek@nats.io>
2020-04-08 20:00:40 -07:00