Commit Graph

65 Commits

Author SHA1 Message Date
Derek Collison
257b670ae2 Cleaned up logging for leafnodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-30 15:53:14 -07:00
Ivan Kozlovic
d2578f9e05 Update to connect/reconnect error reports logic
Changed the introduced new option and added a new one. The idea
is to be able to differentiate between never connected and reconnected
event. The never connected situation will be logged at first attempt
and every hour (by default, configurable).
However, once connected and if trying to reconnect, will report every
attempts by default, but this is configurable too.

These two options are supported for config reload.

Related to #1000
Related to #1001
Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-26 17:51:01 -06:00
Ivan Kozlovic
48c3f7f846 Fixed some flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 09:53:35 -06:00
Ivan Kozlovic
7272e4e317 Make the error report attempts configurable
This is a continuation of #1000. Added a configuration to specify
the number of attempts at which the repeated error is reported.
The algo is now to print only the 1st attempt and when current
attempt % <this config param> == 0.

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 16:28:48 -06:00
Ivan Kozlovic
03930ba0e4 [UPDATED] Reduce report of failed connection attempts
This applies to routes, gateways and leaf node connections.
The failed attempts will be printed at the first, after the first
minute and then every hour.
The connect/error statements now include the attempt number.

Note that in debug mode, all attempts are traced, so you may get
double trace (one for debug, one for info/error).

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 10:13:56 -06:00
Derek Collison
d7140a0fd1 Update for client rename
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-10 15:11:30 -07:00
Derek Collison
acfe372d63 Changes for rename from gnatsd -> nats-server
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-06 15:04:24 -07:00
Alexei Volkov
83aefdc714 [ADDED] Cluster tls insecure configuration
Based on @softkbot PR #913.
Removed the command line parameter, which then removes the need for Options.Cluster.TLSInsecure.
Added a test with config reload.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-11 14:48:22 -06:00
Ivan Kozlovic
42f45ce5b6 [FIXED] Possible delays in delivering messages
There is a possibility that a partial write results in data
not being sent in a timely fashion to a subscription.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-02-02 18:42:50 -07:00
Ivan Kozlovic
111e050d32 Allow service import to work with Gateways
This is not complete solution and is a bit hacky but is a start
to be able to have service import work at least in some basic
cases.

Also fixed a bug where replySub would not be removed from
connection's list of subs after delivery.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-05 20:35:43 -07:00
Ivan Kozlovic
0ba587249a Fixing setting of default gateway TLS Timeout
Moved setting to the default value in setBaselineOptions()
so that config reload does not fail.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-03 18:20:15 -07:00
Ivan Kozlovic
52c724a83c Updates based on comments
- Solve RS+ with wildcards
- Solve issue with messages not send to remote gateways queue subs
  if there was a qsub on local server.
- Made rcache a perAccountCache since it is now used by routes and
  gateways
- Order outbound gateways only on RTT updates
- Print a server's gateway name on startup
- Augment/add some tests
- Update TLS handling: when connecting, use hostname for ServerName
  if url is not IP, otherwise use a hostname that we saved when
  parsing/adding URLs for the remote gateway.
- Send big buffer in chunks if needed.
- Add caching for qsubs match

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-27 19:39:41 -07:00
Ivan Kozlovic
10fd3ca0c6 Gateways [WIP]
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-27 19:00:03 -07:00
Ivan Kozlovic
eb17950971 Introduce some delay before closing clients in LameDuck mode.
This will allow to signal multiple servers at once to go in
that mode and not have their client reconnect to one of the
servers in the group.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-08 18:34:15 -07:00
Ivan Kozlovic
1817b354e3 Update tests on Travis with tweaked GC settings
Moved some tests to "no race" tests that are run separately.
Removing -v and adding -p=1.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-08 16:56:20 -07:00
Ivan Kozlovic
c173d55e2e Update based on comments
Start the lame duck mode in a go routine in the signal handler
because I think we want to be able to shutdown the server while
in that mode.

Kept the closing as a loop in the lameDuckMode() function (did
not use a timer).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-10-22 16:27:30 -06:00
Ivan Kozlovic
0067c3bb04 Added support for lame duck mode
When receiving SIGUSR2 signal (or -sl ldm) the server stops
accepting new clients, closes routes connections and spread the
closing of client connections based on a config lame duck duration
(default is 30sec). This will help preventing a storm of client
reconnect when a server needs to be shutdown.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-10-19 19:07:37 -06:00
Derek Collison
e78d587083 Added support for maximum subscriptions per connection
Signed-off-by: Derek Collison <derek@nats.io>
2018-07-01 15:13:59 -07:00
Derek Collison
3b953ce838 Allow localhost to not be defined, only need 127.0.0.1
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-28 16:10:19 -07:00
Ivan Kozlovic
aff1dcf089 Fix some tests
Add some helpers to check on some state.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-06-27 17:26:49 -06:00
Ivan Kozlovic
0e422812cd Tune some more tests
- Increate WriteDeadline test that otherwise could cause a client
  connect to fail
- Check failed NumRoutes() with retry
- Check that subs are propagated in route permissions test

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-06-26 18:52:56 -06:00
Derek Collison
240e21ac5c Fix restart of server
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-19 22:32:50 -07:00
Derek Collison
844f376140 Performance optimizations, beta3, fixes to various tests.
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-11 15:11:03 -07:00
Derek Collison
d603c53f67 Big message optimizations, slow consumer updates
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Derek Collison
50a99241ea Slow consumer updates and latency improvements.
Use pending bytes as slow consumer trigger, so reintroduce max_pending.
Improve latency with inplace flush calls when appropriate. Utilize simple
time budget for readLoop routine.

Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Derek Collison
644376209b Added large payload pub/sub benchmark
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Ivan Kozlovic
8cb7c18204 Prepare for release 1.1.0 2018-03-23 11:53:04 -06:00
Derek Collison
00901acc78 Update license to Apache 2 2018-03-15 22:31:07 -07:00
Ivan Kozlovic
1acf330e07 [ADDED] Notification to clients when servers leave the cluster
Until now, a server would only notify clients of servers that join
the cluster. More than that, a server would send ot its clients only
information if new servers were added.
This PR changes this by sending to clients that support async INFO
the list of URLs for all servers in the cluster any time that there
is a change (joining or leaving the cluster).
As of now, clients will not be affected by the change (and will not
take benefit of this: removing servers from their server pool). This
will be addressed in each supported client once this is merged.
2018-02-27 14:22:13 -07:00
Ivan Kozlovic
acf4a31e4b Major updates + support for config reload of client/cluster advertise 2018-02-05 20:15:36 -07:00
Peter Miron
306a3f9507 Resolving Ivan's feedback. 2017-11-29 15:35:05 -05:00
Peter Miron
4829592107 removed support for array of Advertise addresses. Added support for Route advertise address. 2017-11-29 11:41:08 -05:00
Peter Miron
7d34b890c6 Takes list of client connect addresses. Uses the first as the host / port sent on info. 2017-11-28 09:55:35 -05:00
Peter Miron
6852298e7b draft of fix for issue #447. allows advertising separate host:ports to client. 2017-11-27 15:34:15 -05:00
Ivan Kozlovic
22cff99e58 [Fixed] Profiling and Monitoring timeout issues
The http servers for those two were recently modified to set
a ReadTimeout and WriteTimeout. The WriteTimeout specifically
caused issues for Profiling since it is common to ask sampling
of several seconds. Pprof code would reject the request if it
detected that http server's WriteTimeout was more than sampling
in request.
For monitoring, any situation that would cause the monitoring code
to take more than 2 seconds to gather information (could be due
to locking, amount of objects to return, time required for sorting,
etc..) would also cause cURL to return empty response or WebBrowser
to fail to display the page.

Resolves #600
2017-11-08 14:58:10 -07:00
Christophe de Vienne
28975f67bb Custom auth: Test both auth success & failure 2017-09-08 17:43:32 +02:00
Christophe de Vienne
b743f91107 Use GetOpts in CustomAuth tests
It increses code coverage, and makes sure GetOpts() behaves properly.
2017-09-08 11:29:26 +02:00
Christophe de Vienne
e556854f54 Rename Custom*Auth to Custom*Authentication
Simplify and complete tests based on Ivan advice.
2017-09-08 10:54:20 +02:00
Christophe de Vienne
e366137b6c Add a test for CustomClientAuth 2017-09-07 17:59:59 +02:00
Tyler Treat
6e3f58f48c Fix tests and cleanup log output 2017-06-14 10:38:21 -05:00
Peter Miron
00744ff426 converted MonitorAddr and ClusterAddr to *net.TCPAddr 2017-06-12 17:40:36 -04:00
Peter Miron
d1f38f38a2 changes to support random ports for clusters and profiler. 2017-06-10 10:35:01 -04:00
Peter Miron
f2a9cc8cb0 fixed go fmt'ing 2017-06-08 11:37:23 -04:00
Peter Miron
43a3f1ef1d cleaned up naming to MonitorAddr instead of HttpPort (as it could be either Http or Https). added test for nil to improve coverage. 2017-06-08 10:46:55 -04:00
Peter Miron
41aa44cd8d Added ability to use random ports to limit unit test port contention. 2017-06-08 10:19:56 -04:00
Ivan Kozlovic
1fb9f211ca Added gosimple
- Get gosimple package
- Updated staticcheck's URL
- Moved build and above checks in `before_script` section to fail fast
- Fixed reports from gosimple
2017-01-25 13:30:11 -07:00
Ivan Kozlovic
95d0152449 [ADDED] Make Write deadline configurable
We use a hardcoded value of 2 seconds for Write deadline when
writing data to client's socket.
This PR makes that value configurable.

Question is should we push the setting down to the client's object
to avoid indirection such as client.srv.opts.WriteDeadline?
2017-01-18 20:33:44 -07:00
Ivan Kozlovic
5f471b6e7f Replace GetListenEndpoint() with ReadyForConnections()
The RunServer() function (and the various variants)
call Server.Start() in a go-routine, but do not return until
it has verified that the server is ready to accept connections.
To do so, it use GetListenEndpoint() to get a suitable connect
address (replacing "0.0.0.0" or "::" with localhost - important
on Windows). It then creates a raw TCP connection to ensure the
server is started, repeating the process in case of failure up
to 10 seconds.

This PR replaces this with a function that checks that client
listener, and route listener if configured, are set. This removes
the need to get a connect address and create test tcp connections.

The reason for this change is that NATS Streaming when starting
the NATS Server (unless configured to connect to a remote one)
calls RunServerWithAuth(), which when getting "localhost" from
GetListenEndpoint(), would fail trying to resolve it. This happened
for the NATS Streaming Docker image built with Go 1.7+.
2016-12-09 14:03:45 -07:00
Derek Collison
8fbacaaea1 Cleanup for cluster opts 2016-12-02 14:29:22 -08:00
Waldemar Quevedo
ff2d6d1983 Add function and test for processing sub command args 2016-12-01 18:18:52 -08:00