nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-02 03:38:42 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	2ad2bed170	[ADDED] Support for route hostname resolution We previously simply called DialTimeout() on a route's url when soliciting. If it resolved to the IP of the host, it would create a route to self, which server detects, but then would not try again with other IPs that would have allowed to form a cluster with other servers running on the other IPs. This PR keeps track of local IPs + cluster port and exclude them from the list of IPs returned by LookupHost API. This even prevent solicitation of routes to self. Only non-local IPs will be tried. Resolves #1586 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-09-08 13:40:17 -06:00
Ivan Kozlovic	20a67a5be8	Websocket: add option to disable TLS The new option Websocket.NoTLS would have to be set to true to disable the server check that enforces TLS configuration. Resolves #1529 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-29 17:33:02 -06:00
Ivan Kozlovic	9b0967a5d1	[FIXED] Handling of gossiped URLs If some servers in the cluster have the same connect URLs (due to the use of client advertise), then it would be possible to have a server sends the connect_urls INFO update to clients with missing URLs. Resolves #1515 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-15 17:39:12 -06:00
Ivan Kozlovic	9288283d90	Fixed accept loops that could leave connections opened This was discovered with the test TestLeafNodeWithGatewaysServerRestart that was sometimes failing. Investigation showed that when cluster B was shutdown, one of the servers on A that had a connection from B that just broke tried to reconnect (as part of reconnect retries of implicit gateways) to a server in B that was in the process of shutting down. The connection had been accepted but createGateway not called because the server's running boolean had been set to false as part of the shutdown. However, the connection was not closed so the server on A had a valid connection to a dead server from cluster B. When the B cluster (now single server) was restarted and a LeafNode connection connected to it, then the gateway from B to A was created, that server on A did not create outbound connection to that B server because it already had one (the zombie one). So this PR strengthens the starting of accept loops and also make sure that if a connection (all type of connections) is not accepted because the server is shutting down, that connection is properly closed. Since all accept loops had almost same code, made a generic function that accept functions to call specific create connection functions. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-06 17:03:19 -06:00
Ivan Kozlovic	27540ee255	Fixed some flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-03 11:30:48 -06:00
Derek Collison	2b9e3e5b15	Merge pull request #1476 from nats-io/cluster_name Cluster names are now required.	2020-06-15 10:07:30 -07:00
Derek Collison	146d8f5dcb	Updates based on feedback, sped up some slow tests Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 17:26:43 -07:00
Derek Collison	dd61535e5a	Cluster names are now required. Added cluster names as required for prep work for clustered JetStream. System can dynamically pick a cluster name and settle on one even in large clusters. Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 15:48:38 -07:00
Ivan Kozlovic	67d2638859	[ADDED] Print the config file being used in startup banner Resolves #1451 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-12 12:21:50 -06:00
Ivan Kozlovic	b9bd5c2d35	Fixed flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-09 15:34:52 -06:00
Ivan Kozlovic	cd6d71deaa	[ADDED] lame_duck_grace_period option The grace period used to be hardcoded at 10 seconds. This option allows the user to configure the amount of time the server will wait before initiating the closing of client connections. Note that the grace period needs to be strictly lower than the overall lame_duck_duration. The server deducts the grace period from that overall duration and spreads the closing of connections during that time. For instance, if there are 1000 connections and the lame duck duration is set to 30 seconds and grace period to 10, then the server will use 30-10 = 20 seconds to spread the closing of those 1000 connections, so say roughly 50 clients per second. Resolves #1459. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-08 11:43:25 -06:00
Ivan Kozlovic	98ea70a590	LameDuckMode takes into account websocket accept loop This is related to #1408. Make sure that we close the websocket "accept loop" if configured before proceeding with the lame duck mode. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-02 17:49:38 -06:00
Derek Collison	05e38ae527	Merge branch 'master' into sys-acc	2020-06-01 11:53:14 -07:00
Derek Collison	2bd7553c71	System Account on by default. Most of the changes are to turn it off for tests that were watching subscriptions and such. Signed-off-by: Derek Collison <derek@nats.io>	2020-05-29 17:56:45 -07:00
Ivan Kozlovic	44e78a1fb6	Fixed some tests - A race test may have consumed a lot of fds going in TIME_WAIT that could cause some issues for other tests - Missing defer filestore.Stop() that would leave flushLoop() routines - A defer for the from server in a LeafNode test - Rework [Re]ConnectErrorReports that was failing often for me locally (probably due to exhaustion of fds - too many TIME_WAIT). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-29 17:47:08 -06:00
Ivan Kozlovic	e9805a3109	[FIXED] Possible removal of interest on queue subs with leaf nodes Server was incorrectly processing a queue subscription removal as both a plain sub and queue sub, which may have resulted in drop of interest even when some queue subs remained. Resolves #1421 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-28 10:21:51 -06:00
Ivan Kozlovic	8678a61e3e	Move the send of INFO after client listener has been shutdown This will ensure that there is no race where clients are accepted after the LDM INFO notification. Also add to the test to make sure that we don't send INFO when routes are disconnected due to internal closing of connections during the shutdown process. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-24 11:38:49 -06:00
Ivan Kozlovic	dc0f688cbf	[FIXED] LameDuckMode sends INFO to clients Also send an INFO to routes so that the remotes can remove the LDM's server client URLs and notify their own clients of this change. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-21 12:15:20 -06:00
Ivan Kozlovic	d1276ad038	Add TLS 1.3 (and new ciphers) in the tlsVersion output Also changed unknown version to "0x.." to show that value is hexa. Resolves #1313 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-03-18 10:09:23 -06:00
Ivan Kozlovic	5eebf02e5f	Fixed TestVersionMatchesTag test When no tag was set, os.Getenv("TRAVIS_TAG") would return empty string. Travis now set TRAVIS_TAG to `''`. So check for both. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-03-09 10:13:32 -06:00
Ivan Kozlovic	156bf7b381	Updates based on code review Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-02-19 16:52:41 -07:00
Ivan Kozlovic	8e4b449119	Fixed flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-02-19 13:19:08 -07:00
Ivan Kozlovic	1b2754475b	Refactor async client tests Updated all tests that use "async" clients. - start the writeLoop (this is in preparation for changes in the server that will not do send-in-place for some protocols, such as PING, etc..) - Added missing defers in several tests - fixed an issue in client.go where test was wrong possibly causing a panic. - Had to skip a test for now since it would fail without server code change. The next step will be ensure that all protocols are sent through the writeLoop and that the data is properly flushed on close (important for -ERR for instance). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-12 11:58:24 -07:00
Ivan Kozlovic	63138509f7	Tune some code/test for Windows Running test suite on a Windows VM, I notice several failures. Updated the compute of the RTT to be at least 1ns. I think that this is just an issue with the VM I am running, but that change will have no impact for normal situations (since setting the rtt to the very minimum duration (1ns) instead of 0) and will prevent some tests from failing. Because of those same timer granularity issues, I had to add some delays between some actions in order for time.Sub()/Since() to actually report something more than 0. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-21 14:32:46 -07:00
R.I.Pienaar	bcf96fa1de	Allows a descriptive server_name to be set This adds a new config option server_name that when set will be exposed in varz, events and more as a descriptive name for the server. If unset though the server_name will default to the pk Signed-off-by: R.I.Pienaar <rip@devco.net>	2019-10-17 18:51:19 +02:00
Ivan Kozlovic	0a72993d80	Add warning for TLS insecure setting on LeafNodes Also fix for #1071 in that we need to check remote gateways TLS config even if main gateway section is not configured with TLS. Related to #1071 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-07-12 17:22:57 -06:00
Ivan Kozlovic	37d08a6c56	[FIXED] Allow TLS InsecureSkipVerify again This has an effect only on connections created by the server, so routes and gateways (explicit and implicit). Make sure that an explicit warning is printed if the insecure property is set, but otherwise allow it. Resolves #1062 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-07-12 12:10:28 -06:00
Derek Collison	10d4f1ab7a	Convert leafnode solicited remotes to array Signed-off-by: Derek Collison <derek@nats.io>	2019-07-10 11:53:34 -07:00
Derek Collison	0f20592fb3	Made leafnode connect a Debugf to be consistent, added first connect Noticef. Signed-off-by: Derek Collison <derek@nats.io>	2019-06-29 19:11:02 -07:00
Derek Collison	257b670ae2	Cleaned up logging for leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2019-05-30 15:53:14 -07:00
Ivan Kozlovic	d2578f9e05	Update to connect/reconnect error reports logic Changed the introduced new option and added a new one. The idea is to be able to differentiate between never connected and reconnected event. The never connected situation will be logged at first attempt and every hour (by default, configurable). However, once connected and if trying to reconnect, will report every attempts by default, but this is configurable too. These two options are supported for config reload. Related to #1000 Related to #1001 Resolves #969 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-26 17:51:01 -06:00
Ivan Kozlovic	48c3f7f846	Fixed some flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-24 09:53:35 -06:00
Ivan Kozlovic	7272e4e317	Make the error report attempts configurable This is a continuation of #1000. Added a configuration to specify the number of attempts at which the repeated error is reported. The algo is now to print only the 1st attempt and when current attempt % <this config param> == 0. Resolves #969 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-20 16:28:48 -06:00
Ivan Kozlovic	03930ba0e4	[UPDATED] Reduce report of failed connection attempts This applies to routes, gateways and leaf node connections. The failed attempts will be printed at the first, after the first minute and then every hour. The connect/error statements now include the attempt number. Note that in debug mode, all attempts are traced, so you may get double trace (one for debug, one for info/error). Resolves #969 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-20 10:13:56 -06:00
Derek Collison	d7140a0fd1	Update for client rename Signed-off-by: Derek Collison <derek@nats.io>	2019-05-10 15:11:30 -07:00
Derek Collison	acfe372d63	Changes for rename from gnatsd -> nats-server Signed-off-by: Derek Collison <derek@nats.io>	2019-05-06 15:04:24 -07:00
Alexei Volkov	83aefdc714	[ADDED] Cluster tls insecure configuration Based on @softkbot PR #913. Removed the command line parameter, which then removes the need for Options.Cluster.TLSInsecure. Added a test with config reload. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-03-11 14:48:22 -06:00
Ivan Kozlovic	42f45ce5b6	[FIXED] Possible delays in delivering messages There is a possibility that a partial write results in data not being sent in a timely fashion to a subscription. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-02-02 18:42:50 -07:00
Ivan Kozlovic	111e050d32	Allow service import to work with Gateways This is not complete solution and is a bit hacky but is a start to be able to have service import work at least in some basic cases. Also fixed a bug where replySub would not be removed from connection's list of subs after delivery. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-12-05 20:35:43 -07:00
Ivan Kozlovic	0ba587249a	Fixing setting of default gateway TLS Timeout Moved setting to the default value in setBaselineOptions() so that config reload does not fail. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-12-03 18:20:15 -07:00
Ivan Kozlovic	52c724a83c	Updates based on comments - Solve RS+ with wildcards - Solve issue with messages not send to remote gateways queue subs if there was a qsub on local server. - Made rcache a perAccountCache since it is now used by routes and gateways - Order outbound gateways only on RTT updates - Print a server's gateway name on startup - Augment/add some tests - Update TLS handling: when connecting, use hostname for ServerName if url is not IP, otherwise use a hostname that we saved when parsing/adding URLs for the remote gateway. - Send big buffer in chunks if needed. - Add caching for qsubs match Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-11-27 19:39:41 -07:00
Ivan Kozlovic	10fd3ca0c6	Gateways [WIP] Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-11-27 19:00:03 -07:00
Ivan Kozlovic	eb17950971	Introduce some delay before closing clients in LameDuck mode. This will allow to signal multiple servers at once to go in that mode and not have their client reconnect to one of the servers in the group. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-11-08 18:34:15 -07:00
Ivan Kozlovic	1817b354e3	Update tests on Travis with tweaked GC settings Moved some tests to "no race" tests that are run separately. Removing -v and adding -p=1. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-11-08 16:56:20 -07:00
Ivan Kozlovic	c173d55e2e	Update based on comments Start the lame duck mode in a go routine in the signal handler because I think we want to be able to shutdown the server while in that mode. Kept the closing as a loop in the lameDuckMode() function (did not use a timer). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-10-22 16:27:30 -06:00
Ivan Kozlovic	0067c3bb04	Added support for lame duck mode When receiving SIGUSR2 signal (or -sl ldm) the server stops accepting new clients, closes routes connections and spread the closing of client connections based on a config lame duck duration (default is 30sec). This will help preventing a storm of client reconnect when a server needs to be shutdown. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-10-19 19:07:37 -06:00
Derek Collison	e78d587083	Added support for maximum subscriptions per connection Signed-off-by: Derek Collison <derek@nats.io>	2018-07-01 15:13:59 -07:00
Derek Collison	3b953ce838	Allow localhost to not be defined, only need 127.0.0.1 Signed-off-by: Derek Collison <derek@nats.io>	2018-06-28 16:10:19 -07:00
Ivan Kozlovic	aff1dcf089	Fix some tests Add some helpers to check on some state. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-06-27 17:26:49 -06:00
Ivan Kozlovic	0e422812cd	Tune some more tests - Increase WriteDeadline test that otherwise could cause a client connect to fail - Check failed NumRoutes() with retry - Check that subs are propagated in route permissions test Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-06-26 18:52:56 -06:00

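Commit 7272e4e317 above states the reporting rule for repeated connection errors: print only the 1st attempt, and then whenever the current attempt modulo the configured value is 0. A minimal sketch of that rule is shown below. `shouldReport` and `every` are hypothetical names chosen for illustration, and the guard for a non-positive `every` is an assumption added to avoid a division by zero; the commit message does not say how the server treats that case.

```go
package main

import "fmt"

// shouldReport is a hypothetical helper (not part of nats-server) that
// applies the rule from the commit message: report the first attempt,
// then every attempt that is a multiple of the configured value.
func shouldReport(attempt, every int) bool {
	if attempt == 1 {
		return true
	}
	// Assumption: a non-positive setting means only the first attempt
	// is reported. This guard also avoids a modulo by zero.
	if every <= 0 {
		return false
	}
	return attempt%every == 0
}

func main() {
	var reported []int
	for attempt := 1; attempt <= 10; attempt++ {
		if shouldReport(attempt, 4) {
			reported = append(reported, attempt)
		}
	}
	fmt.Println(reported) // [1 4 8]
}
```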

94 Commits