When a server is told to connect to a server (with auto-discovery),
it tries to connect once. There have been a report where that
connection fails, but would probably succeed if tried again (#408).
This new parameter allows to configure the number of times a failed
implicit connect should be tried.
Resolves#408
The RunServer() function (and the various variants)
call Server.Start() in a go-routine, but do not return until
it has verified that the server is ready to accept connections.
To do so, it use GetListenEndpoint() to get a suitable connect
address (replacing "0.0.0.0" or "::" with localhost - important
on Windows). It then creates a raw TCP connection to ensure the
server is started, repeating the process in case of failure up
to 10 seconds.
This PR replaces this with a function that checks that client
listener, and route listener if configured, are set. This removes
the need to get a connect address and create test tcp connections.
The reason for this change is that NATS Streaming when starting
the NATS Server (unless configured to connect to a remote one)
calls RunServerWithAuth(), which when getting "localhost" from
GetListenEndpoint(), would fail trying to resolve it. This happened
for the NATS Streaming Docker image built with Go 1.7+.
Running `go vet ./...` with `go 1.7.3` would report the following:
```
server/route.go:342: assignment copies lock value to tlsConfig: crypto/tls.Config contains sync.Once contains sync.Mutex
server/server.go:479: assignment copies lock value to config: crypto/tls.Config contains sync.Once contains sync.Mutex
```
Add a “clone” function while waiting for this to be addressed
by the language itself (https://go-review.googlesource.com/#/c/28075/)
By default, a server is now sending to its clients the client URLs
of all servers in the cluster. This allows clients to be able
to reconnect to any server in the cluster even if those clients
were not configured with the list of servers in the cluster.
However, there may be cases where it would make sense to disable
this feature. This now can be done with this option/command line
parameter.
Resolves#322
Trying to use IPv6 address for the cluster host would fail.
Also, there were some unclosed channels in case of accept loop
setup failures.
Resolves#323
Clients that will be at the ClientProtoInfo protocol level (or above)
will now receive an asynchronous INFO protocol when the server
they connect to adds a *new* route. This means that when the cluster
adds a new server, all clients in the cluster should now be notified
of this new addition.
Ensure that all socket writes are protected with deadlines.
For connection Close(), also use deadlines since in case of TLS,
the Close() will send an alert (do a write) if the handshake was
completed. If the peer is not reading, this would cause the Close()
to hang.
Refactor the way client is initialized. We need to ensure that
clients are not added to the clients map and readLoop started if
the server is in the process of being shutdown otherwise there
is a chance that the server already gathered the list of connections
to close and this one would not be included, leaving a readLoop
running.
Same occurs for routes, with the complexity that the readLoop is
started well before the route connection is added to the server
routes' list. We need a temporary map that contains those connections
to be able to close them on server Shutdown.
Fixed some flapping tests.
We need to make sure that when Shutdown() returns, routes go routines
that try to connect or reconnect have returned. Otherwise, this may
affect tests running one after the other (a server from one test
may connect to a server in the next test).
-No need to store ip url string in c.route and resolve remote IP
when forwarding the INFO to known servers.
-When checking if a route is explicit, use strings.ToLower() once
for the url being checked.
Both seed and chained cases are now handled properly when servers
connect quickly and concurrently to one another.
When accepting a route, the server will forward the new route INFO
protocol to its known routes. In turn those routes will connect
to the new server (if not already connected).
A retry for implicit route was introduced to mitigate the issue
with two servers connecting to each other and electing the opposite
connection as the winner, resulting in both connections being dropped.
The server with smaller ID will try once to reconnect.
Some tests were fixed to handle possible extra INFO protocol.
New tests added.
Fix issue: https://github.com/nats-io/gnatsd/issues/206
Attempt to address issue #175.
Instead of trying to detect if route URL will point to route listen address, detects that the route remoteID is server's ID.
If so, closes the connection and stop trying.
Without the server fix, tls_test.go would likely report an error. The server would show a parser error with protocol snippet containing "random" bytes, likely encrypted data.
If processRoute executes before createRoute finishes registering the route, and before a subscriber has connected (or reconnected), and the subscriber connects (or reconnects) before the route is registered, then this subscription will stay local and not be forwarded to the route.
When message is split we need to copy message arguments to avoid
rewriting them with new message. Subject, reply and size were correctly
copied by sid wasn't. That led to dropping some messages in clustered
mode if they were split, and the second part was long enough to overwrite
sid in the original buffer.
Govet doesn't like functions that look like format handlers not ending in `f`,
such as Debug() vs Debugf(). This is changing some of the new log handling to
use 'f' function names to appease govet.
Updated the implicit handling of including the client as an arg without being
used in a format string. Now the client object simply has a Errorf function
for logging errors and it adds itself onto the format string.