The HTTP servers for those two were recently modified to set
a ReadTimeout and a WriteTimeout. The WriteTimeout specifically
caused issues for profiling, since it is common to request a
sampling duration of several seconds. The pprof code would reject
the request if it detected that the requested sampling duration
exceeded the HTTP server's WriteTimeout.
For monitoring, any situation that caused the monitoring code
to take more than 2 seconds to gather information (for example
locking, the number of objects to return, or the time required for
sorting) would also cause cURL to return an empty response or a web
browser to fail to display the page.
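
The rejection described above can be sketched as follows. This is a minimal illustration, not the server's or pprof's actual code: `profileWouldBeRejected`, `newServerWithWriteTimeout` and `newProfilingServer` are hypothetical names, and leaving WriteTimeout unset on the profiling server is only one assumed way to avoid the problem.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// profileWouldBeRejected is a hypothetical helper mirroring the kind of check
// described above: a profile request is refused when the server has a
// WriteTimeout and the requested sampling time does not fit within it.
func profileWouldBeRejected(srv *http.Server, seconds float64) bool {
	return srv.WriteTimeout != 0 && seconds >= srv.WriteTimeout.Seconds()
}

// newServerWithWriteTimeout models the problematic setup: both timeouts at 2 seconds.
func newServerWithWriteTimeout() *http.Server {
	return &http.Server{ReadTimeout: 2 * time.Second, WriteTimeout: 2 * time.Second}
}

// newProfilingServer models an assumed fix: keep ReadTimeout, leave WriteTimeout unset.
func newProfilingServer() *http.Server {
	return &http.Server{ReadTimeout: 2 * time.Second}
}

func main() {
	// A 30 second CPU profile against each configuration.
	fmt.Println(profileWouldBeRejected(newServerWithWriteTimeout(), 30)) // true
	fmt.Println(profileWouldBeRejected(newProfilingServer(), 30))        // false
}
```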
Resolves #600
- Get gosimple package
- Updated staticcheck's URL
- Moved the build and the above checks into the `before_script` section to fail fast
- Fixed reports from gosimple
We use a hardcoded value of 2 seconds for the write deadline when
writing data to a client's socket.
This PR makes that value configurable.
Open question: should we push the setting down to the client object
to avoid indirection such as client.srv.opts.WriteDeadline?
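
A minimal sketch of a configurable write deadline, assuming a trimmed-down `Options` struct. Only the `WriteDeadline` field name comes from the text above; `DefaultWriteDeadline`, `writeDeadline`, `writeWithDeadline` and `slowConsumerTimesOut` are hypothetical names used for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// DefaultWriteDeadline is the previously hardcoded value.
const DefaultWriteDeadline = 2 * time.Second

// Options is a hypothetical, trimmed-down options struct.
type Options struct {
	WriteDeadline time.Duration
}

// writeDeadline returns the configured value, or the default when unset.
func (o *Options) writeDeadline() time.Duration {
	if o.WriteDeadline <= 0 {
		return DefaultWriteDeadline
	}
	return o.WriteDeadline
}

// writeWithDeadline bounds a single write to conn by the configured deadline.
func writeWithDeadline(conn net.Conn, opts *Options, data []byte) error {
	if err := conn.SetWriteDeadline(time.Now().Add(opts.writeDeadline())); err != nil {
		return err
	}
	defer conn.SetWriteDeadline(time.Time{}) // clear the deadline afterwards
	_, err := conn.Write(data)
	return err
}

// slowConsumerTimesOut writes to a peer that never reads and reports
// whether the write failed with a timeout.
func slowConsumerTimesOut() bool {
	server, client := net.Pipe()
	defer server.Close()
	defer client.Close()
	opts := &Options{WriteDeadline: 50 * time.Millisecond}
	err := writeWithDeadline(server, opts, []byte("PING\r\n"))
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

func main() {
	fmt.Println((&Options{}).writeDeadline()) // 2s
	fmt.Println(slowConsumerTimesOut())       // true
}
```

Keeping the lookup behind one small method is one way to limit the indirection raised in the question above: callers ask for the effective deadline instead of walking client.srv.opts themselves.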
The RunServer() function (and its variants)
calls Server.Start() in a goroutine, but does not return until
it has verified that the server is ready to accept connections.
To do so, it uses GetListenEndpoint() to get a suitable connect
address (replacing "0.0.0.0" or "::" with localhost, which is
important on Windows). It then creates a raw TCP connection to
ensure the server is started, retrying on failure for up
to 10 seconds.
This PR replaces this with a function that checks that the client
listener, and the route listener if configured, are set. This removes
the need to get a connect address and create test TCP connections.
The reason for this change is that NATS Streaming, when starting
the NATS Server (unless configured to connect to a remote one),
calls RunServerWithAuth(), which, when getting "localhost" from
GetListenEndpoint(), would fail trying to resolve it. This happened
for the NATS Streaming Docker image built with Go 1.7+.
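
The listener-based readiness check described above might look roughly like this. It is a sketch under assumptions: the `server` struct, its fields, and the names `start`, `readyForConnections`, `shutdown` and `runServer` are hypothetical stand-ins, and the 25 ms polling interval is arbitrary.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"time"
)

// server is a hypothetical stand-in holding only the fields needed here.
type server struct {
	mu            sync.Mutex
	clustered     bool         // true when a route listener is configured
	listener      net.Listener // client listener
	routeListener net.Listener // route listener, set only when clustered
}

// start opens the listeners and publishes them under the lock.
func (s *server) start() error {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	var rl net.Listener
	if s.clustered {
		if rl, err = net.Listen("tcp", "127.0.0.1:0"); err != nil {
			l.Close()
			return err
		}
	}
	s.mu.Lock()
	s.listener, s.routeListener = l, rl
	s.mu.Unlock()
	return nil
}

// readyForConnections polls until the required listeners are set or the
// timeout expires. No address resolution or test connection is needed.
func (s *server) readyForConnections(timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		s.mu.Lock()
		ok := s.listener != nil && (!s.clustered || s.routeListener != nil)
		s.mu.Unlock()
		if ok {
			return true
		}
		time.Sleep(25 * time.Millisecond)
	}
	return false
}

// shutdown closes whichever listeners were opened.
func (s *server) shutdown() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.listener != nil {
		s.listener.Close()
	}
	if s.routeListener != nil {
		s.routeListener.Close()
	}
}

// runServer starts the server in a goroutine and waits up to 10 seconds.
func runServer(clustered bool) (*server, bool) {
	s := &server{clustered: clustered}
	go func() {
		if err := s.start(); err != nil {
			fmt.Println("start failed:", err)
		}
	}()
	return s, s.readyForConnections(10 * time.Second)
}

func main() {
	s, ready := runServer(true)
	defer s.shutdown()
	fmt.Println("ready:", ready)
}
```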
This will protect the server from non-NATS clients (telnet, etc.),
or misbehaving clients that create the TCP connection but
block before sending the CONNECT.
The drawback is that the client may or may not receive the error
message (in my tests, it received it only 10% to 20% of the time).
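
One way to picture this protection is a read deadline on the first protocol line. The sketch below is an assumption about the mechanism, not the server's actual code: `awaitConnect`, the demo helpers, and the error text are illustrative placeholders. The error write is best effort, which is consistent with the drawback noted above.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"strings"
	"time"
)

var errNoConnect = errors.New("first protocol line was not CONNECT")

// awaitConnect waits up to timeout for the first line and requires it to be
// a CONNECT. A real server would keep the bufio.Reader for later reads.
func awaitConnect(conn net.Conn, timeout time.Duration) error {
	if err := conn.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		// Best effort: the peer may never read this before the close.
		conn.SetWriteDeadline(time.Now().Add(timeout))
		conn.Write([]byte("-ERR 'placeholder timeout error'\r\n"))
		conn.Close()
		return err
	}
	conn.SetReadDeadline(time.Time{})
	if !strings.HasPrefix(line, "CONNECT ") {
		conn.Close()
		return errNoConnect
	}
	return nil
}

// silentClientRejected connects a peer that never sends anything.
func silentClientRejected() bool {
	server, client := net.Pipe()
	defer client.Close()
	err := awaitConnect(server, 50*time.Millisecond)
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// connectClientAccepted connects a peer that sends CONNECT right away.
func connectClientAccepted() bool {
	server, client := net.Pipe()
	defer server.Close()
	defer client.Close()
	go client.Write([]byte("CONNECT {}\r\n"))
	return awaitConnect(server, time.Second) == nil
}

func main() {
	fmt.Println(silentClientRejected())  // true
	fmt.Println(connectClientAccepted()) // true
}
```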
Trying to use an IPv6 address for the cluster host would fail.
Also, some channels were left unclosed when the accept loop
setup failed.
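
The text above does not state why the IPv6 host failed. A common cause, assumed here purely for illustration, is building the address with plain "host:port" formatting, which leaves IPv6 literals unbracketed. `naiveAddr`, `clusterAddr` and `parses` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// naiveAddr formats host and port the way that breaks for IPv6 literals.
func naiveAddr(host string, port int) string {
	return fmt.Sprintf("%s:%d", host, port)
}

// clusterAddr uses net.JoinHostPort, which brackets IPv6 literals.
func clusterAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

// parses reports whether addr splits cleanly into host and port.
func parses(addr string) bool {
	_, _, err := net.SplitHostPort(addr)
	return err == nil
}

func main() {
	fmt.Println(naiveAddr("::1", 6222), parses(naiveAddr("::1", 6222)))     // ::1:6222 false
	fmt.Println(clusterAddr("::1", 6222), parses(clusterAddr("::1", 6222))) // [::1]:6222 true
}
```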
Resolves #323
Clients at the ClientProtoInfo protocol level (or above)
will now receive an asynchronous INFO protocol message when the
server they connect to adds a *new* route. This means that when the
cluster adds a new server, all clients in the cluster should be
notified of the addition.
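
A rough sketch of the notification described above. Only the ClientProtoInfo name comes from the text; the constant values, the `client` and `info` types, the `connect_urls` field, and the function names are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Protocol levels. The values are assumed for this sketch.
const (
	ClientProtoZero = iota // original protocol: INFO only at connect time
	ClientProtoInfo        // client accepts INFO sent after connect
)

// client is a hypothetical stand-in; out replaces the real socket.
type client struct {
	proto int
	out   []string
}

// info carries only the field this sketch needs (assumed name).
type info struct {
	ConnectURLs []string `json:"connect_urls,omitempty"`
}

// asyncInfo renders the INFO protocol line advertising the given URLs.
func asyncInfo(urls []string) (string, error) {
	b, err := json.Marshal(info{ConnectURLs: urls})
	if err != nil {
		return "", err
	}
	return "INFO " + string(b) + "\r\n", nil
}

// notifyNewRoute sends the INFO line to every client able to handle it.
func notifyNewRoute(clients []*client, urls []string) error {
	line, err := asyncInfo(urls)
	if err != nil {
		return err
	}
	for _, c := range clients {
		if c.proto >= ClientProtoInfo {
			c.out = append(c.out, line)
		}
	}
	return nil
}

func main() {
	oldClient := &client{proto: ClientProtoZero}
	newClient := &client{proto: ClientProtoInfo}
	if err := notifyNewRoute([]*client{oldClient, newClient}, []string{"10.0.0.2:4222"}); err != nil {
		fmt.Println("notify failed:", err)
		return
	}
	fmt.Println(len(oldClient.out), len(newClient.out)) // 0 1
	fmt.Printf("%q\n", newClient.out[0])
}
```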
Don't check what is sent back. The point is that the client should be fully initialized at this point.
We can't use metrics to ensure that the "check" connection is gone, since in some tests the server is started and clients auto-reconnect to it.
For tests that depend on the number of clients connected (such as the monitor one), have specific code for those tests.
- The raw connection used to check that the server is started now consumes the INFO, sends a PING, and consumes the PONG before returning.
- The route test needs to make sure that the client connection has client id 2. A PING/PONG exchange before creating the route connection makes sure of that.
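
The INFO/PING/PONG exchange on the raw check connection can be sketched as below. `checkClientReady`, `fakeServer` and `demoCheck` are hypothetical names, the fake server exists only to make the example runnable, and the replies are consumed without being validated, in line with the note above about not checking what is sent back.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"time"
)

// checkClientReady consumes the server's first line (INFO), sends PING, and
// consumes the reply (PONG). Contents are deliberately not inspected: once a
// reply arrives, the server has fully initialized this client.
func checkClientReady(conn net.Conn, timeout time.Duration) error {
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	defer conn.SetDeadline(time.Time{})
	r := bufio.NewReader(conn)
	if _, err := r.ReadString('\n'); err != nil { // INFO
		return err
	}
	if _, err := conn.Write([]byte("PING\r\n")); err != nil {
		return err
	}
	_, err := r.ReadString('\n') // PONG
	return err
}

// fakeServer plays the server side of the exchange for the demo.
func fakeServer(conn net.Conn) {
	defer conn.Close()
	if _, err := conn.Write([]byte("INFO {}\r\n")); err != nil {
		return
	}
	if _, err := bufio.NewReader(conn).ReadString('\n'); err != nil {
		return
	}
	conn.Write([]byte("PONG\r\n"))
}

// demoCheck runs the exchange over an in-memory connection.
func demoCheck() error {
	srv, cli := net.Pipe()
	defer cli.Close()
	go fakeServer(srv)
	return checkClientReady(cli, 2*time.Second)
}

func main() {
	fmt.Println("check error:", demoCheck())
}
```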
* Add server.GetListenEndpoint() to return the options' host and port once the server is ready to accept client connections. The server can be asked to pick a random port. This function returns a string of the form "host:port", with the port selected by the net.Listen() call.
* Replace the use of server.Addr() with the above function to connect to the starting server (using net.Dial) to check for success. The original issue was that, when no hostname is specified in the configuration, the server uses 0.0.0.0 for the listen address. However, server.Addr() would return "[::]", even on a machine with IPv6 disabled, which would cause the net.Dial call to fail with "network unreachable".
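
A minimal sketch of combining the configured host with the port chosen by net.Listen(), as described above. `listenEndpoint` and `demoEndpoint` are hypothetical helpers, not the server's GetListenEndpoint() implementation, and the loopback host is an assumption for the demo.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// listenEndpoint joins the configured host with the port the listener
// actually bound, which matters when the configured port is 0 (random).
func listenEndpoint(host string, l net.Listener) (string, error) {
	tcpAddr, ok := l.Addr().(*net.TCPAddr)
	if !ok {
		return "", fmt.Errorf("not a TCP listener: %T", l.Addr())
	}
	return net.JoinHostPort(host, strconv.Itoa(tcpAddr.Port)), nil
}

// demoEndpoint listens on a random port and dials the resulting endpoint.
func demoEndpoint() (string, error) {
	const host = "127.0.0.1" // assumed configured host
	l, err := net.Listen("tcp", net.JoinHostPort(host, "0"))
	if err != nil {
		return "", err
	}
	defer l.Close()
	ep, err := listenEndpoint(host, l)
	if err != nil {
		return "", err
	}
	conn, err := net.Dial("tcp", ep)
	if err != nil {
		return ep, err
	}
	conn.Close()
	return ep, nil
}

func main() {
	ep, err := demoEndpoint()
	fmt.Println(ep, err)
}
```

Using the configured host rather than the listener's reported address avoids dialing a wildcard such as "[::]".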