Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".
This change checks for more filesystem related errors and calls t.Fatal
if there is an error.
Currently, temporary test files and directories are written in lots of
different paths within the OS's temp dir. This makes it hard to know
which files are from nats-server and which are unrelated. This in turn
makes it hard to clean up nats-server test files.
Added cluster names as required for prep work for clustered JetStream. System can dynamically pick a cluster name and settle on one even in large clusters.
Signed-off-by: Derek Collison <derek@nats.io>
Running test suite on a Windows VM, I notice several failures.
Updated the compute of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since setting the rtt
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.
Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which will allow to compute the RTT
reasonably soon (since the PingInterval could be user configured
and set much higher).
For Route in PR #1101, I was sending the PING on receiving the
INFO which required changing bunch of tests. Changing that to
also use the first timer interval of 1sec and reverted changes
to route tests.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added the RTT field to each route reported in routez.
Ensure that when a route is accepted, we send a PING to compute
the first RTT and don't have to wait for the ping timer to fire.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a route is established, it is possible that each server sends
its list of subscriptions to each other at the same time. Doing
it in place from the readLoop could then cause problems because
each side could reach a point where the outbound socket buffer
is full and no one is dequeuing data (since readLoop is doing
the send of the subs list).
We changed sending this list from a go routine. However, for small
number of subscriptions, it is not required and was causing some
of the tests to fail because of timing issues.
We will now send in place if the estimated size of all protocols
is below a give threshold (1MB).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We now send A- if an account does not exists, or if there is no
interest on a given subject and no existing subscription.
An A+ is sent if an A- was previously sent and a subscription
for this account is registered.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is an issue in master only, not in any public release.
The issue is that permissions should be assigned as-is for the
route perms because Publish/Subscribe could be nil, so trying
to dereference Publish.Allow/Deny or Subscribe.Allow/Deny could
crash. The code checking for permissions correctly check if
Publish/Subscribe is nil or not.
This was introduced with PR #725
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Increate WriteDeadline test that otherwise could cause a client
connect to fail
- Check failed NumRoutes() with retry
- Check that subs are propagated in route permissions test
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The `client.perms` struct is left unchanged. We simply map Import
and Export semantics to existing Publish and Subscribe ones.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This PR is based out of #633. It imroves parsing QRSID so that the
TestRouteQueueSemantics test now passes (when dealing with malformed
QRSID).
A test similar to what is reported in #632 was also added. This
test however, uncovers a race condition that will be fixed in a
separate PR.
Resolves#632
Until now, a server would only notify clients of servers that join
the cluster. More than that, a server would send ot its clients only
information if new servers were added.
This PR changes this by sending to clients that support async INFO
the list of URLs for all servers in the cluster any time that there
is a change (joining or leaving the cluster).
As of now, clients will not be affected by the change (and will not
take benefit of this: removing servers from their server pool). This
will be addressed in each supported client once this is merged.
When the option Cluster.NoAdvertise is false, a server will send
an INFO protocol message to its client when a server has joined
the cluster.
Previously, the protocol would be sent only if the
joining server's "client URLs" (the addresses where clients connect
to) were new. It will now be sent regardless if the server joins
(for the first time) or rejoins the cluster.
Clients are still by default invoking the DiscoveredServersCB callback
only if they themselves detect that new URLs were added. A separate
PR may be filled to client libraries repo to be able to invoke
the callback anytime an async INFO protocol is received.
Based on @madgrenadier PR #597.
The RunServer() function (and the various variants)
call Server.Start() in a go-routine, but do not return until
it has verified that the server is ready to accept connections.
To do so, it use GetListenEndpoint() to get a suitable connect
address (replacing "0.0.0.0" or "::" with localhost - important
on Windows). It then creates a raw TCP connection to ensure the
server is started, repeating the process in case of failure up
to 10 seconds.
This PR replaces this with a function that checks that client
listener, and route listener if configured, are set. This removes
the need to get a connect address and create test tcp connections.
The reason for this change is that NATS Streaming when starting
the NATS Server (unless configured to connect to a remote one)
calls RunServerWithAuth(), which when getting "localhost" from
GetListenEndpoint(), would fail trying to resolve it. This happened
for the NATS Streaming Docker image built with Go 1.7+.
By default, a server is now sending to its clients the client URLs
of all servers in the cluster. This allows clients to be able
to reconnect to any server in the cluster even if those clients
were not configured with the list of servers in the cluster.
However, there may be cases where it would make sense to disable
this feature. This now can be done with this option/command line
parameter.
Resolves#322
Clients that will be at the ClientProtoInfo protocol level (or above)
will now receive an asynchronous INFO protocol when the server
they connect to adds a *new* route. This means that when the cluster
adds a new server, all clients in the cluster should now be notified
of this new addition.
Both seed and chained cases are now handled properly when servers
connect quickly and concurrently to one another.
When accepting a route, the server will forward the new route INFO
protocol to its known routes. In turn those routes will connect
to the new server (if not already connected).
A retry for implicit route was introduced to mitigate the issue
with two servers connecting to each other and electing the opposite
connection as the winner, resulting in both connections being dropped.
The server with smaller ID will try once to reconnect.
Some tests were fixed to handle possible extra INFO protocol.
New tests added.
Fix issue: https://github.com/nats-io/gnatsd/issues/206
Attempt to address issue #175.
Instead of trying to detect if route URL will point to route listen address, detects that the route remoteID is server's ID.
If so, closes the connection and stop trying.
- The raw connection used to check that the server is started now consumes the INFO and sends PING and consumes PONG before returning.
- The route test needs to make sure that the client connection has client id 2. Using PING/PONG before creating route connection to make sure of that.