Commit Graph

83 Commits

Author SHA1 Message Date
Ivan Kozlovic
98ea70a590 LameDuckMode takes into account websocket accept loop
This is related to #1408.
Make sure that we close the websocket "accept loop" if configured
before proceeding with the lame duck mode.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-02 17:49:38 -06:00
Derek Collison
05e38ae527 Merge branch 'master' into sys-acc 2020-06-01 11:53:14 -07:00
Derek Collison
2bd7553c71 System Account on by default.
Most of the changes are to turn it off for tests that were watching subscriptions and such.

Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 17:56:45 -07:00
Ivan Kozlovic
44e78a1fb6 Fixed some tests
- A race test may have consumed a lot of fds going in TIME_WAIT
that could cause some issues for other tests
- Missing defer filestore.Stop() that would leave flushLoop()
routines
- A defer for the from server in a LeafNode test
- Rework [Re]ConnectErrorReports that was failing often for me
locally (probably due to exhaustion of fds - too many TIME_WAIT).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-29 17:47:08 -06:00
Ivan Kozlovic
e9805a3109 [FIXED] Possible removal of interest on queue subs with leaf nodes
Server was incorrectly processing a queue subscription removal
as both a plain sub and queue sub, which may have resulted in
drop of interest even when some queue subs remained.

Resolves #1421

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-28 10:21:51 -06:00
Ivan Kozlovic
8678a61e3e Move the send of INFO after client listener has been shutdown
This will ensure that there is no race where clients are accepted
after the LDM INFO notification.

Also add to the test to make sure that we don't send INFO when
routes are disconnected due to internal closing of connections
during the shutdown process.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-24 11:38:49 -06:00
Ivan Kozlovic
dc0f688cbf [FIXED] LameDuckMode sends INFO to clients
Also send an INFO to routes so that the remotes can remove the
LDM's server client URLs and notify their own clients of this
change.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-21 12:15:20 -06:00
Ivan Kozlovic
d1276ad038 Add TLS 1.3 (and new ciphers) in the tlsVersion output
Also changed unknown version to "0x.." to show that value is hexa.

Resolves #1313

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-03-18 10:09:23 -06:00
Ivan Kozlovic
5eebf02e5f Fixed TestVersionMatchesTag test
When no tag was set, os.Getenv("TRAVIS_TAG") would return empty string.
Travis now set TRAVIS_TAG to `''`. So check for both.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-03-09 10:13:32 -06:00
Ivan Kozlovic
156bf7b381 Updates based on code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-02-19 16:52:41 -07:00
Ivan Kozlovic
8e4b449119 Fixed flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-02-19 13:19:08 -07:00
Ivan Kozlovic
1b2754475b Refactor async client tests
Updated all tests that use "async" clients.
- start the writeLoop (this is in preparation for changes in the
  server that will not do send-in-place for some protocols, such
  as PING, etc..)
- Added missing defers in several tests
- fixed an issue in client.go where test was wrong possibly causing
  a panic.
- Had to skip a test for now since it would fail without server code
  change.

The next step will be ensure that all protocols are sent through
the writeLoop and that the data is properly flushed on close (important
for -ERR for instance).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-12-12 11:58:24 -07:00
Ivan Kozlovic
63138509f7 Tune some code/test for Windows
Running test suite on a Windows VM, I notice several failures.
Updated the compute of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since setting the rtt
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.

Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-11-21 14:32:46 -07:00
R.I.Pienaar
bcf96fa1de Allows a descriptive server_name to be set
This adds a new config option server_name that
when set will be exposed in varz, events and more
as a descriptive name for the server.

If unset though the server_name will default to the pk

Signed-off-by: R.I.Pienaar <rip@devco.net>
2019-10-17 18:51:19 +02:00
Ivan Kozlovic
0a72993d80 Add warning for TLS insecure setting on LeafNodes
Also fix for #1071 in that we need to check remote gateways TLS
config even if main gateway section is not configured with TLS.

Related to #1071

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-12 17:22:57 -06:00
Ivan Kozlovic
37d08a6c56 [FIXED] Allow TLS InsecureSkipVerify again
This has an effect only on connections created by the server,
so routes and gateways (explicit and implicit).
Make sure that an explicit warning is printed if the insecure
property is set, but otherwise allow it.

Resolves #1062

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-12 12:10:28 -06:00
Derek Collison
10d4f1ab7a Convert leafnode solicited remotes to array
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-10 11:53:34 -07:00
Derek Collison
0f20592fb3 Made leafnode connect a Debugf to be consistent, added first connect Noticef.
Signed-off-by: Derek Collison <derek@nats.io>
2019-06-29 19:11:02 -07:00
Derek Collison
257b670ae2 Cleaned up logging for leafnodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-30 15:53:14 -07:00
Ivan Kozlovic
d2578f9e05 Update to connect/reconnect error reports logic
Changed the introduced new option and added a new one. The idea
is to be able to differentiate between never connected and reconnected
event. The never connected situation will be logged at first attempt
and every hour (by default, configurable).
However, once connected and if trying to reconnect, will report every
attempts by default, but this is configurable too.

These two options are supported for config reload.

Related to #1000
Related to #1001
Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-26 17:51:01 -06:00
Ivan Kozlovic
48c3f7f846 Fixed some flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 09:53:35 -06:00
Ivan Kozlovic
7272e4e317 Make the error report attempts configurable
This is a continuation of #1000. Added a configuration to specify
the number of attempts at which the repeated error is reported.
The algo is now to print only the 1st attempt and when current
attempt % <this config param> == 0.

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 16:28:48 -06:00
Ivan Kozlovic
03930ba0e4 [UPDATED] Reduce report of failed connection attempts
This applies to routes, gateways and leaf node connections.
The failed attempts will be printed at the first, after the first
minute and then every hour.
The connect/error statements now include the attempt number.

Note that in debug mode, all attempts are traced, so you may get
double trace (one for debug, one for info/error).

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 10:13:56 -06:00
Derek Collison
d7140a0fd1 Update for client rename
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-10 15:11:30 -07:00
Derek Collison
acfe372d63 Changes for rename from gnatsd -> nats-server
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-06 15:04:24 -07:00
Alexei Volkov
83aefdc714 [ADDED] Cluster tls insecure configuration
Based on @softkbot PR #913.
Removed the command line parameter, which then removes the need for Options.Cluster.TLSInsecure.
Added a test with config reload.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-11 14:48:22 -06:00
Ivan Kozlovic
42f45ce5b6 [FIXED] Possible delays in delivering messages
There is a possibility that a partial write results in data
not being sent in a timely fashion to a subscription.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-02-02 18:42:50 -07:00
Ivan Kozlovic
111e050d32 Allow service import to work with Gateways
This is not complete solution and is a bit hacky but is a start
to be able to have service import work at least in some basic
cases.

Also fixed a bug where replySub would not be removed from
connection's list of subs after delivery.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-05 20:35:43 -07:00
Ivan Kozlovic
0ba587249a Fixing setting of default gateway TLS Timeout
Moved setting to the default value in setBaselineOptions()
so that config reload does not fail.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-03 18:20:15 -07:00
Ivan Kozlovic
52c724a83c Updates based on comments
- Solve RS+ with wildcards
- Solve issue with messages not send to remote gateways queue subs
  if there was a qsub on local server.
- Made rcache a perAccountCache since it is now used by routes and
  gateways
- Order outbound gateways only on RTT updates
- Print a server's gateway name on startup
- Augment/add some tests
- Update TLS handling: when connecting, use hostname for ServerName
  if url is not IP, otherwise use a hostname that we saved when
  parsing/adding URLs for the remote gateway.
- Send big buffer in chunks if needed.
- Add caching for qsubs match

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-27 19:39:41 -07:00
Ivan Kozlovic
10fd3ca0c6 Gateways [WIP]
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-27 19:00:03 -07:00
Ivan Kozlovic
eb17950971 Introduce some delay before closing clients in LameDuck mode.
This will allow to signal multiple servers at once to go in
that mode and not have their client reconnect to one of the
servers in the group.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-08 18:34:15 -07:00
Ivan Kozlovic
1817b354e3 Update tests on Travis with tweaked GC settings
Moved some tests to "no race" tests that are run separately.
Removing -v and adding -p=1.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-08 16:56:20 -07:00
Ivan Kozlovic
c173d55e2e Update based on comments
Start the lame duck mode in a go routine in the signal handler
because I think we want to be able to shutdown the server while
in that mode.

Kept the closing as a loop in the lameDuckMode() function (did
not use a timer).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-10-22 16:27:30 -06:00
Ivan Kozlovic
0067c3bb04 Added support for lame duck mode
When receiving SIGUSR2 signal (or -sl ldm) the server stops
accepting new clients, closes routes connections and spread the
closing of client connections based on a config lame duck duration
(default is 30sec). This will help preventing a storm of client
reconnect when a server needs to be shutdown.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-10-19 19:07:37 -06:00
Derek Collison
e78d587083 Added support for maximum subscriptions per connection
Signed-off-by: Derek Collison <derek@nats.io>
2018-07-01 15:13:59 -07:00
Derek Collison
3b953ce838 Allow localhost to not be defined, only need 127.0.0.1
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-28 16:10:19 -07:00
Ivan Kozlovic
aff1dcf089 Fix some tests
Add some helpers to check on some state.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-06-27 17:26:49 -06:00
Ivan Kozlovic
0e422812cd Tune some more tests
- Increate WriteDeadline test that otherwise could cause a client
  connect to fail
- Check failed NumRoutes() with retry
- Check that subs are propagated in route permissions test

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-06-26 18:52:56 -06:00
Derek Collison
240e21ac5c Fix restart of server
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-19 22:32:50 -07:00
Derek Collison
844f376140 Performance optimizations, beta3, fixes to various tests.
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-11 15:11:03 -07:00
Derek Collison
d603c53f67 Big message optimizations, slow consumer updates
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Derek Collison
50a99241ea Slow consumer updates and latency improvements.
Use pending bytes as slow consumer trigger, so reintroduce max_pending.
Improve latency with inplace flush calls when appropriate. Utilize simple
time budget for readLoop routine.

Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Derek Collison
644376209b Added large payload pub/sub benchmark
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Ivan Kozlovic
8cb7c18204 Prepare for release 1.1.0 2018-03-23 11:53:04 -06:00
Derek Collison
00901acc78 Update license to Apache 2 2018-03-15 22:31:07 -07:00
Ivan Kozlovic
1acf330e07 [ADDED] Notification to clients when servers leave the cluster
Until now, a server would only notify clients of servers that join
the cluster. More than that, a server would send ot its clients only
information if new servers were added.
This PR changes this by sending to clients that support async INFO
the list of URLs for all servers in the cluster any time that there
is a change (joining or leaving the cluster).
As of now, clients will not be affected by the change (and will not
take benefit of this: removing servers from their server pool). This
will be addressed in each supported client once this is merged.
2018-02-27 14:22:13 -07:00
Ivan Kozlovic
acf4a31e4b Major updates + support for config reload of client/cluster advertise 2018-02-05 20:15:36 -07:00
Peter Miron
306a3f9507 Resolving Ivan's feedback. 2017-11-29 15:35:05 -05:00
Peter Miron
4829592107 removed support for array of Advertise addresses. Added support for Route advertise address. 2017-11-29 11:41:08 -05:00