Based on @softkbot PR #913.
Removed the command line parameter, which then removes the need for Options.Cluster.TLSInsecure.
Added a test with config reload.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
There is a possibility that a partial write results in data
not being sent in a timely fashion to a subscription.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is not complete solution and is a bit hacky but is a start
to be able to have service import work at least in some basic
cases.
Also fixed a bug where replySub would not be removed from
connection's list of subs after delivery.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Solve RS+ with wildcards
- Solve issue with messages not send to remote gateways queue subs
if there was a qsub on local server.
- Made rcache a perAccountCache since it is now used by routes and
gateways
- Order outbound gateways only on RTT updates
- Print a server's gateway name on startup
- Augment/add some tests
- Update TLS handling: when connecting, use hostname for ServerName
if url is not IP, otherwise use a hostname that we saved when
parsing/adding URLs for the remote gateway.
- Send big buffer in chunks if needed.
- Add caching for qsubs match
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This will allow to signal multiple servers at once to go in
that mode and not have their client reconnect to one of the
servers in the group.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Start the lame duck mode in a go routine in the signal handler
because I think we want to be able to shutdown the server while
in that mode.
Kept the closing as a loop in the lameDuckMode() function (did
not use a timer).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When receiving SIGUSR2 signal (or -sl ldm) the server stops
accepting new clients, closes routes connections and spread the
closing of client connections based on a config lame duck duration
(default is 30sec). This will help preventing a storm of client
reconnect when a server needs to be shutdown.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Increate WriteDeadline test that otherwise could cause a client
connect to fail
- Check failed NumRoutes() with retry
- Check that subs are propagated in route permissions test
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Use pending bytes as slow consumer trigger, so reintroduce max_pending.
Improve latency with inplace flush calls when appropriate. Utilize simple
time budget for readLoop routine.
Signed-off-by: Derek Collison <derek@nats.io>
Until now, a server would only notify clients of servers that join
the cluster. More than that, a server would send ot its clients only
information if new servers were added.
This PR changes this by sending to clients that support async INFO
the list of URLs for all servers in the cluster any time that there
is a change (joining or leaving the cluster).
As of now, clients will not be affected by the change (and will not
take benefit of this: removing servers from their server pool). This
will be addressed in each supported client once this is merged.
The http servers for those two were recently modified to set
a ReadTimeout and WriteTimeout. The WriteTimeout specifically
caused issues for Profiling since it is common to ask sampling
of several seconds. Pprof code would reject the request if it
detected that http server's WriteTimeout was more than sampling
in request.
For monitoring, any situation that would cause the monitoring code
to take more than 2 seconds to gather information (could be due
to locking, amount of objects to return, time required for sorting,
etc..) would also cause cURL to return empty response or WebBrowser
to fail to display the page.
Resolves#600
- Get gosimple package
- Updated staticcheck's URL
- Moved build and above checks in `before_script` section to fail fast
- Fixed reports from gosimple
We use a hardcoded value of 2 seconds for Write deadline when
writing data to client's socket.
This PR makes that value configurable.
Question is should we push the setting down to the client's object
to avoid indirection such as client.srv.opts.WriteDeadline?
The RunServer() function (and the various variants)
call Server.Start() in a go-routine, but do not return until
it has verified that the server is ready to accept connections.
To do so, it use GetListenEndpoint() to get a suitable connect
address (replacing "0.0.0.0" or "::" with localhost - important
on Windows). It then creates a raw TCP connection to ensure the
server is started, repeating the process in case of failure up
to 10 seconds.
This PR replaces this with a function that checks that client
listener, and route listener if configured, are set. This removes
the need to get a connect address and create test tcp connections.
The reason for this change is that NATS Streaming when starting
the NATS Server (unless configured to connect to a remote one)
calls RunServerWithAuth(), which when getting "localhost" from
GetListenEndpoint(), would fail trying to resolve it. This happened
for the NATS Streaming Docker image built with Go 1.7+.
This will protect the server from non NATS clients (telnet, etc),
or misbehaving clients that would create the tcp connection but
block before sending the CONNECT.
The drawback is that the client may or may not receive the error
message (in my tests, it was getting only between 10%-20% of times).
Trying to use IPv6 address for the cluster host would fail.
Also, there were some unclosed channels in case of accept loop
setup failures.
Resolves#323
Clients that will be at the ClientProtoInfo protocol level (or above)
will now receive an asynchronous INFO protocol when the server
they connect to adds a *new* route. This means that when the cluster
adds a new server, all clients in the cluster should now be notified
of this new addition.