- When detecting a duplicate route, it was possible that a server
would lose track of the peer's gateway URL, which would prevent
it from gossiping that URL to inbound gateway connections.
- When a server has gateways enabled and has as a remote its
own gateway, the monitoring endpoint `/varz` would include it
but without the "urls" array.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When servers leave a cluster and their gateway URLs were not in
the remote cluster's configuration, it is possible that their
gateway URLs do not disappear from the list of URLs in the `/varz`
monitoring endpoint.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
[fixed] reservations accounting issue on reload introduced by:
commit: bfb726e8e9
clearResources appeared to have been a workaround and broke
reload for non-global accounts.
Signed-off-by: Matthias Hanel <mh@synadia.com>
Updated some tests based on this change and also added missing deferred
connection close or server shutdown calls.
Fixed how the OCSP run goroutine would shut down, which would
never complete because grWG was not decremented by this goroutine
prior to invoking s.Shutdown().
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".
This change checks for more filesystem related errors and calls t.Fatal
if there is an error.
This also applies to times that end up in that JSON.
Where applicable, moved time.Now() to where it is used.
Moved calls to .UTC() to where the time is created if that time is converted
later anyway.
Signed-off-by: Matthias Hanel <mh@synadia.com>
This moves from explicit imports and subscriptions to one wildcard subscription and a single wildcard export.
Signed-off-by: Derek Collison <derek@nats.io>
The new endpoints are /jsz on http and "$SYS.REQ.SERVER.PING.JSZ" and "$SYS.REQ.SERVER.%s.JSZ".
$SYS.REQ.ACCOUNT.%s.JSZ will only return info for the particular account
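
As an illustration, the request subjects can be built from the templates above. This sketch assumes the `%s` placeholder is a server ID in the server subject and an account name in the account subject; the function names and sample IDs are hypothetical.

```go
package main

import "fmt"

// Subject templates as quoted in the text above.
const (
	jszPingSubj    = "$SYS.REQ.SERVER.PING.JSZ"
	jszServerSubj  = "$SYS.REQ.SERVER.%s.JSZ"
	jszAccountSubj = "$SYS.REQ.ACCOUNT.%s.JSZ"
)

// serverJSZSubject targets one server (assumption: %s is the server ID).
func serverJSZSubject(serverID string) string {
	return fmt.Sprintf(jszServerSubj, serverID)
}

// accountJSZSubject limits the response to one account.
func accountJSZSubject(account string) string {
	return fmt.Sprintf(jszAccountSubj, account)
}

func main() {
	fmt.Println(jszPingSubj)
	fmt.Println(serverJSZSubject("SERVER_ID"))
	fmt.Println(accountJSZSubject("ACC"))
}
```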
Signed-off-by: Matthias Hanel <mh@synadia.com>
Only the user (from username/password connection method) was reported
in this monitoring endpoint. Will now report proper nkey, public key,
etc.
Resolves #1799
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Returned imports/exports are formatted like JWT exports/imports, even if
the originating account is from config.
Fixes #1604
Signed-off-by: Matthias Hanel <mh@synadia.com>
- A race test may have consumed a lot of fds going in TIME_WAIT
that could cause some issues for other tests
- Missing defer filestore.Stop() that would leave flushLoop()
routines
- A defer for the from server in a LeafNode test
- Rework [Re]ConnectErrorReports that was failing often for me
locally (probably due to exhaustion of fds - too many TIME_WAIT).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles.
This will have the server that completes the loop be the one that detects the error solely based on
its own loop detection subject.
Other changes are just to fix tests that were not waiting for the new LDS sub.
Signed-off-by: Derek Collison <derek@nats.io>
Updated all tests that use "async" clients.
- start the writeLoop (this is in preparation for changes in the
server that will not do send-in-place for some protocols, such
as PING, etc.)
- Added missing defers in several tests
- fixed an issue in client.go where test was wrong possibly causing
a panic.
- Had to skip a test for now since it would fail without server code
change.
The next step will be to ensure that all protocols are sent through
the writeLoop and that the data is properly flushed on close (important
for -ERR for instance).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Running the test suite on a Windows VM, I noticed several failures.
Updated the computation of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since it sets the RTT
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.
Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Ivan had the idea of using the CONNECT to establish a first estimate of RTT
without additional PING/PONGs.
Signed-off-by: Derek Collison <derek@nats.io>
This is achieved by subscribing to a unique subject. If the LS+
protocol is coming back for the same subject on the same account,
then this indicates a loop.
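
A heavily simplified sketch of that check is shown below. The `leafServer` type, `processLeafSub`, and the `$LDS.` prefix are assumptions made for illustration; the real server tracks far more state.

```go
package main

import "fmt"

// ldsPrefix is an assumed prefix for the loop detection subject.
const ldsPrefix = "$LDS."

// leafServer is a hypothetical, heavily simplified server.
type leafServer struct {
	account string // account bound to the leafnode connection
	ldsSubj string // unique subject this server subscribed to
}

func newLeafServer(account, uniqueID string) *leafServer {
	return &leafServer{account: account, ldsSubj: ldsPrefix + uniqueID}
}

// processLeafSub handles an incoming LS+ and returns an error when the
// subscription is this server's own unique subject on the same account.
func (s *leafServer) processLeafSub(account, subject string) error {
	if account == s.account && subject == s.ldsSubj {
		return fmt.Errorf("loop detected for account %q on subject %q", account, subject)
	}
	return nil
}

func main() {
	a := newLeafServer("ACC", "serverA")
	fmt.Println(a.processLeafSub("ACC", "foo.bar")) // regular interest
	fmt.Println(a.processLeafSub("ACC", a.ldsSubj)) // own subject came back
}
```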
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This adds a new config option server_name that
when set will be exposed in varz, events and more
as a descriptive name for the server.
If unset, the server_name will default to the pk.
Signed-off-by: R.I.Pienaar <rip@devco.net>
Defaults to 1sec but will be opts.PingInterval if that value is lower.
All non-client connections invoke this function for the first
PING.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which allows the RTT to be computed
reasonably soon (since the PingInterval could be user configured
and set much higher).
For Route in PR #1101, I was sending the PING on receiving the
INFO, which required changing a bunch of tests. Changed that to
also use the first timer interval of 1sec and reverted changes
to route tests.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- On startup, verify that the local account in leafnode (if specified)
can be found, otherwise fail startup.
- At runtime, print error and continue trying to reconnect.
Will need to decide on a better approach.
- When using basic auth (user/password), it was possible for a
solicited Leafnode connection to not use user/password when
trying a URL that was discovered through gossip. The server
now saves the credentials of a configured URL to use with
the discovered ones.
Updated RouteRTT test in case RTT does not seem to be updated
because it always gets the same value.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added the RTT field to each route reported in routez.
Ensure that when a route is accepted, we send a PING to compute
the first RTT and don't have to wait for the ping timer to fire.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This endpoint lists the gateway/cluster name, address and port,
then the list of outbound/inbound connections.
For each remote gateway there will be at most one outbound connection.
There can be 0 or more inbound connections for the same remote
gateway.
For each of these outbound/inbound connections, the connection info
similar to Connz is reported. Optionally, one can include the
interest mode/stats for each account.
Here are possible options:
* No specific options
http://host:port/gatewayz
* Limit to specific remote gateway, say name "B":
http://host:port/gatewayz?gw_name=B
* Include accounts (default limit to 1024 accounts)
http://host:port/gatewayz?accs=1
* Specific limit, say 200 (note accs=1 in this case is optional)
http://host:port/gatewayz?accs=1&accs_limit=200
* Specific account, say "acc_1". Note that accs=1 is not required then
http://host:port/gatewayz?acc_name=acc_1
* Above options can be mixed: specific remote gateway (B), with up to 200
accounts reported
http://host:port/gatewayz?gw_name=B&accs_limit=200
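
For illustration, such URLs can be assembled with `net/url`. This sketch assumes the options are passed as standard query parameters; `gatewayzQuery` and the `localhost:8222` address are hypothetical. Note that `url.Values.Encode` sorts parameters by key, so their order may differ from the examples above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// gatewayzQuery holds the options listed above (hypothetical type).
type gatewayzQuery struct {
	GatewayName   string // gw_name
	Accounts      bool   // accs=1
	AccountsLimit int    // accs_limit
	AccountName   string // acc_name
}

// URL builds the monitoring URL for the given host:port.
func (q gatewayzQuery) URL(hostPort string) string {
	v := url.Values{}
	if q.GatewayName != "" {
		v.Set("gw_name", q.GatewayName)
	}
	if q.Accounts {
		v.Set("accs", "1")
	}
	if q.AccountsLimit > 0 {
		v.Set("accs_limit", strconv.Itoa(q.AccountsLimit))
	}
	if q.AccountName != "" {
		v.Set("acc_name", q.AccountName)
	}
	u := url.URL{Scheme: "http", Host: hostPort, Path: "/gatewayz", RawQuery: v.Encode()}
	return u.String()
}

func main() {
	fmt.Println(gatewayzQuery{}.URL("localhost:8222"))
	fmt.Println(gatewayzQuery{GatewayName: "B", AccountsLimit: 200}.URL("localhost:8222"))
}
```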
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- TestSystemAccountConnectionUpdatesStopAfterNoLocal: I believe that
the check on number of notifications was wrong. Since we did not
consume the ones for the connect, the expected count after the
disconnect is 8 instead of 4.
- Possible fix for GW tests complaining about number of outbound/inbound
connections. I think that it may be possible that the connection does not
succeed right away (remote not fully started, etc.) and due to dial timeout
and reconnect attempt delay, I suspect that when given a max time
of 1sec to complete, it may not be enough.
Quick change for now is to override to 2secs in the
wait helpers. If that proves conclusive, we could remove the
timeout given to these helpers.
- TestGatewaySendAllSubsBadProtocol: used a t.Fatalf() in checkFor
instead of return fmt.Errorf().
- TestLeafNodeResetsMSGProto: this test is not about change to
interest mode only, so to avoid possible mix of protos, delay
a bit creation of gateway after creation of leaf node.
- Some defer s.Shutdown() were missing
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>