Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".
This change checks for more filesystem related errors and calls t.Fatal
if there is an error.
This also applies to times that end up in that json.
Where applicable moved time.Now() to where it is used.
Moved calls to .UTC() to where time is created it that time is converted
later anyway.
Signed-off-by: Matthias Hanel <mh@synadia.com>
This moves from explicit imports and subscriptions to one wildcard subscription and a single wildcard export.
Signed-off-by: Derek Collison <derek@nats.io>
The new endpoints are /jsz on http and "$SYS.REQ.SERVER.PING.JSZ" and "$SYS.REQ.SERVER.%s.JSZ".
$SYS.REQ.ACCOUNT.%s.JSZ will only return info for the particular account
Signed-off-by: Matthias Hanel <mh@synadia.com>
Only the user (from username/password connection method) was reported
in this monitoring endpoint. Will now report proper nkey, public key,
etc..
Resolves#1799
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Returned imports/exports are formated like jwt exports imports, even if
they originating account is from config.
Fixes#1604
Signed-off-by: Matthias Hanel <mh@synadia.com>
- A race test may have consumed a lot of fds going in TIME_WAIT
that could cause some issues for other tests
- Missing defer filestore.Stop() that would leave flushLoop()
routines
- A defer for the from server in a LeafNode test
- Rework [Re]ConnectErrorReports that was failing often for me
locally (probably due to exhaustion of fds - too many TIME_WAIT).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles.
This will have the server who completes the loop be the one that detects the error soley based on
its own loop detection subject.
Otehr changes are just to fix tests that were not waiting for the new LDS sub.
Signed-off-by: Derek Collison <derek@nats.io>
Updated all tests that use "async" clients.
- start the writeLoop (this is in preparation for changes in the
server that will not do send-in-place for some protocols, such
as PING, etc..)
- Added missing defers in several tests
- fixed an issue in client.go where test was wrong possibly causing
a panic.
- Had to skip a test for now since it would fail without server code
change.
The next step will be ensure that all protocols are sent through
the writeLoop and that the data is properly flushed on close (important
for -ERR for instance).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Running test suite on a Windows VM, I notice several failures.
Updated the compute of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since setting the rtt
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.
Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Ivan had the idea of using the CONNECT to establish a first estimate of RTT
without additional PING/PONGs.
Signed-off-by: Derek Collison <derek@nats.io>
This is achieved by subscribing to a unique subject. If the LS+
protocol is coming back for the same subject on the same account,
then this indicates a loop.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This adds a new config option server_name that
when set will be exposed in varz, events and more
as a descriptive name for the server.
If unset though the server_name will default to the pk
Signed-off-by: R.I.Pienaar <rip@devco.net>
Defaults to 1sec but will be opts.PingInterval if value is lower.
All non client connections invoked this function for the first
PING.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which will allow to compute the RTT
reasonably soon (since the PingInterval could be user configured
and set much higher).
For Route in PR #1101, I was sending the PING on receiving the
INFO which required changing bunch of tests. Changing that to
also use the first timer interval of 1sec and reverted changes
to route tests.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- On startup, verify that local account in leafnode (if specified
can be found otherwise fail startup).
- At runtime, print error and continue trying to reconnect.
Will need to decide a better approach.
- When using basic auth (user/password), it was possible for a
solicited Leafnode connection to not use user/password when
trying an URL that was discovered through gossip. The server
now saves the credentials of a configured URL to use with
the discovered ones.
Updated RouteRTT test in case RTT does not seem to be updated
because getting always the same value.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added the RTT field to each route reported in routez.
Ensure that when a route is accepted, we send a PING to compute
the first RTT and don't have to wait for the ping timer to fire.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Such endpoint will list the gateway/cluster name, address and port
then list of outbound/inbound connections.
For each remote gateway there will be at most one outbound connection.
There can be 0 or more inbound connections for the same remote
gateway.
For each of these outbound/inbound connection, the connection info
similar to Connz is reported. Optionally, one can include the
interest mode/stats for each account.
Here are possible options:
* No specific options
http://host:port/gatewayz
* Limit to specific remote gateway, say name "B":
http://host:port/gatewayz/gw_name=B
* Include accounts (default limit to 1024 accounts)
http://host:port/gatewayz/accs=1
* Specific limit, say 200 (note accs=1 in this case is optional)
http://host:port/gatewayz/accs=1&accs_limit=200
* Specific account, say "acc_1". Note that accs=1 is not required then
http://host:port/gatewayz/acc_name=acc_1
* Above options can be mixed: specific remote gateway (B), with 100
accounts reported
http://host:port/gatewayz/gw_name=B&accs_limit=200
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- TestSystemAccountConnectionUpdatesStopAfterNoLocal: I believe that
the check on number of notifications was wrong. Since we did not
consume the ones for the connect, the expected count after the
disconnect is 8 instead of 4.
- Possible fix GW tests complaining about number of outbound/inbound
I think that it may be possible that connection does not succeed
right away (remote to fully started, etc) and due to dial timeout
and reconnect attempt delay, I suspect that when given a max time
of 1sec to complete, it may not be enough.
Quick change for now is to override to 2secs for now in the
wait helpers. If that proves conclusive, we could remove the
timeout given to these helpers.
- TestGatewaySendAllSubsBadProtocol: used a t.Fatalf() in checkFor
instead of return fmt.Errorf().
- TestLeafNodeResetsMSGProto: this test is not about change to
interest mode only, so to avoid possible mix of protos, delay
a bit creation of gateway after creation of leaf node.
- Some defer s.Shutdown() were missing
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
In varz's cluster{} section, there was no URLs field. This PR adds
it and displays the routes defined in the cluster{} config section.
The value gets updated should there be a config reload following
addition/removal of an url from "routes".
If config had 1 route to "nats://127.0.0.1:1234", here is what
it would look like now:
```
"cluster": {
"addr": "0.0.0.0",
"cluster_port": 6222,
"auth_timeout": 1,
"urls": [
"127.0.0.1:1234"
]
},
```
Adding route to "127.0.0.1:4567" and doing config reload:
```
"cluster": {
"addr": "0.0.0.0",
"cluster_port": 6222,
"auth_timeout": 1,
"urls": [
"127.0.0.1:1234",
"127.0.0.1:4567"
]
},
```
Note that due to how we handle discovered servers in the cluster,
new urls dynamically discovered will not show in above output.
This could be done, but would need some changes in how we store
things (actually in this case, new urls are not stored, just
attempted to be connected. Once they connect, they would be visible
in /routez).
For gateways, however, this PR displays the combination of the
URLs defined in config and the ones that are discovered after
a connection is made to a give cluster. So say cluster A has a single
url to one server in cluster B, when connecting to that server,
the server on A will get the list of the gateway URLs that one
can connect to, and these will be reflected in /varz. So this is
a different behavior that for routes. As explained above, we could
harmonize the behavior in a future PR.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
----------------------------------------------------------------
Backward-incompatibility note:
Varz used to embed *Info and *Options which are other server objects.
However, Info is a struct that servers used to send protocols to other
servers or clients and its content must contain json tags since we
need to marshal those to be sent over. The problem is that it made
those fields now accessible to users calling Varz() and also visible
to the http /varz output. Some fields in Info were introduced in the
2.0 branch that clashed with json tag in Options, which made cluster{}
for instance disappear in the /varz output - because a Cluster string
in Info has the same json tag, and Cluster in Info is empty in some
cases.
For users that embed NATS and were using Server.Varz() directly,
without the use of the monitoring endpoint, they were then given
access (which was not the intent) to server internals (Info and Options).
Fields that were in Info or Options or directly in Varz that did not
clash with each other could be referenced directly, for instace, this
is you could access the server ID:
v, _ := s.Varz(nil)
fmt.Println(v.ID)
Another way would be:
fmt.Println(v.Info.ID)
Same goes for fields that were brought from embedding the Options:
fmt.Println(v.MaxConn)
or
fmt.Println(v.Options.MaxConn)
We have decided to explicitly define fields in Varz, which means
that if you previously accessed fields through v.Info or v.Options,
you will have to update your code to use the corresponding field
directly: v.ID or v.MaxConn for instance.
So fields were also duplicated between Info/Options and Varz itself
so depending on which one your application was accessing, you may
have to update your code.
---------------------------------------------------------------
Other issues that have been fixed is races that were introduced
by the fact that the creation of a Varz object (pointing to
some server data) was done under server lock, but marshaling not
being done under that lock caused races.
The fact that object returned to user through Server.Varz() also
had references to server internal objects had to be fixed by
returning deep copy of those internal objects.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>