This is related to PR #2407. Since the 64MB pending size is actually
configurable, we should fail only if max_payload is greater than
the configured max_pending. This is done in validateOptions(), which
covers both the config file and direct options in embedded cases.
The check in opts.go is reverted to max int32 since at this point
we don't know if/what max_pending will be, so we simply check
that it is not more than an int32.
For the next minor release, we could have another change that
imposes a lower limit on max_payload (regardless of whether
max_pending is higher).
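A minimal sketch of the kind of check described above, assuming the
option field names and types (the actual validateOptions() may differ):

    // Fail if max_payload exceeds the configured max_pending.
    if o.MaxPending > 0 && int64(o.MaxPayload) > o.MaxPending {
        return fmt.Errorf("max_payload (%d) cannot be higher than max_pending (%d)",
            o.MaxPayload, o.MaxPending)
    }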
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Currently, we use ReadyForConnections in server tests to wait for the
server to be ready. However, when it fails we get no clue as to why.
This change adds a new unexported method called readyForConnections
that returns an error describing which check failed. The exported
ReadyForConnections version works exactly as before. The unexported
version is used in internal tests only.
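A hedged sketch of the pattern (the readiness helper is illustrative,
not the actual server internals):

    // readyForConnections reports which check failed instead of just false.
    func (s *Server) readyForConnections(d time.Duration) error {
        var lastErr error
        end := time.Now().Add(d)
        for time.Now().Before(end) {
            if lastErr = s.checkReady(); lastErr == nil { // hypothetical helper
                return nil
            }
            time.Sleep(25 * time.Millisecond)
        }
        return fmt.Errorf("failed to be ready for connections: %v", lastErr)
    }

    // ReadyForConnections keeps its original boolean signature.
    func (s *Server) ReadyForConnections(d time.Duration) bool {
        return s.readyForConnections(d) == nil
    }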
This was introduced by PR #2071.
In some tests, options are loaded from a config file that has the
pid file set to "/tm/nats-server/nats-server.pid"; however, the
expected option's pid path was set based on tmpRoot. The problem
is that on macOS, that value would be "/var/folders/xxx", which
would not match.
So this PR reverts the changes to the expected pid file name: it
simply needs to match what is in the test.conf file.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".
This change checks for more filesystem-related errors and calls t.Fatal
if there is an error.
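For example, the pattern now used in tests looks roughly like this:

    // Fail the test loudly instead of silently ignoring cleanup errors.
    if err := os.RemoveAll(dir); err != nil {
        t.Fatalf("Error removing %q: %v", dir, err)
    }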
Currently, temporary test files and directories are written in lots of
different paths within the OS's temp dir. This makes it hard to know
which files are from nats-server and which are unrelated. This in turn
makes it hard to clean up nats-server test files.
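One way to address this, sketched here with an assumed prefix, is to
group all test artifacts under a recognizable name inside the temp dir:

    // All nats-server test artifacts share a common, greppable prefix.
    dir, err := os.MkdirTemp("", "nats-server-test-")
    if err != nil {
        t.Fatalf("Error creating temp dir: %v", err)
    }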
We previously simply called DialTimeout() on a route's URL when
soliciting. If it resolved to the IP of the host, it would create
a route to self, which the server detects, but it would then not
try again with other IPs that would have allowed it to form a
cluster with the other servers running on those IPs.
This PR keeps track of local IPs + cluster port and excludes them
from the list of IPs returned by the LookupHost API. This even
prevents solicitation of routes to self. Only non-local IPs will
be tried.
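A hedged sketch of the filtering idea (the function and the local-IP
set are illustrative):

    // excludeLocal resolves host and drops IPs that belong to this server
    // (localIPs is assumed to hold the server's own IPs for the cluster port).
    func excludeLocal(host string, localIPs map[string]struct{}) ([]string, error) {
        ips, err := net.LookupHost(host)
        if err != nil {
            return nil, err
        }
        nonLocal := ips[:0]
        for _, ip := range ips {
            if _, local := localIPs[ip]; !local {
                nonLocal = append(nonLocal, ip)
            }
        }
        return nonLocal, nil
    }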
Resolves #1586
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Inhibit Go's default TCP keepalive settings for NATS
Go 1.13 changed the semantics of the tuning parameters for TCP keepalives, including the default value. This affects all TCP listeners. The NATS protocol has its own L7 keepalive system (PING/PONG) and the Go defaults are not a good fit for some valid deployment scenarios, while Go doesn't directly expose a working API for tuning these.
Rather than add a configuration knob and pull in another dependency
(with portability issues), just disable TCP keepalives for all
listeners used for speaking the NATS protocol.
Change the tests so we test the same logic. Do not change HTTP monitoring, profiling, or the websocket API listeners.
Change KeepAlive on client connections too.
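Disabling keepalives per connection is straightforward in Go; a minimal
sketch (the error handling style is illustrative):

    // Override Go's default of enabling TCP keepalives on new connections;
    // NATS has its own L7 PING/PONG keepalive.
    if tc, ok := conn.(*net.TCPConn); ok {
        if err := tc.SetKeepAlive(false); err != nil {
            s.Errorf("Error disabling TCP keepalive: %v", err)
        }
    }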
We cannot call c.closeConnection() under the server lock because
closeConnection() can acquire the server lock in some cases.
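The safe pattern, sketched with illustrative names, is to capture the
connection under the lock and close it only after releasing it:

    s.mu.Lock()
    c := s.conn // illustrative: the connection that must be closed
    s.mu.Unlock()
    if c != nil {
        c.closeConnection(ServerShutdown)
    }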
Created a test that should run without `-race` to reproduce the deadlock
(which it does), but it would sometimes fail because the cluster would
not be formed. This uncovered an issue with conflict resolution, which
the test TestRouteClusterNameConflictBetweenStaticAndDynamic() can
reproduce easily. The issue was that we were not updating a dynamic
cluster name with the remote's if the remote's was not dynamic.
Resolves #1543
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was discovered with the test TestLeafNodeWithGatewaysServerRestart,
which was sometimes failing. Investigation showed that when cluster B
was shut down, one of the servers in A whose connection from B had
just broken tried to reconnect (as part of the reconnect retries of
implicit gateways) to a server in B that was in the process of
shutting down. The connection had been accepted, but createGateway was
not called because the server's running boolean had been set to false
as part of the shutdown. However, the connection was not closed, so the
server in A had a valid connection to a dead server from cluster B.
When the B cluster (now a single server) was restarted and a LeafNode
connection connected to it, the gateway from B to A was created, but
the server in A did not create an outbound connection to that B server
because it already had one (the zombie).
So this PR strengthens the starting of the accept loops and also makes
sure that if a connection (of any type) is not accepted because the
server is shutting down, that connection is properly closed.
Since all accept loops shared almost the same code, a generic accept
function was created that calls connection-type-specific create
functions.
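A hedged sketch of such a generic accept loop (the helper names are
assumptions, not the actual server code):

    func (s *Server) acceptConnections(l net.Listener, acceptName string,
        createFunc func(net.Conn)) {
        for {
            conn, err := l.Accept()
            if err != nil {
                if s.isRunning() { // hypothetical helper
                    s.Errorf("%s accept error: %v", acceptName, err)
                    continue
                }
                return
            }
            if !s.startGoRoutine(func() {
                createFunc(conn)
                s.grWG.Done()
            }) {
                // Server is shutting down: close the connection instead
                // of leaving a zombie on the remote side.
                conn.Close()
            }
        }
    }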
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added cluster names as required prep work for clustered JetStream. The
system can dynamically pick a cluster name and settle on one even in
large clusters.
Signed-off-by: Derek Collison <derek@nats.io>
That broke sending async INFO in the case where there was an update
between accepting the TCP connection and receiving the CONNECT
that indicates that the client can receive async INFO.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This could happen if the remote server is running but not dequeuing
from the socket. A TLS connection's Close() may send/read, so we
need to protect it with a deadline.
For non-client/leaf connections, do not call flushOutbound().
Set the write deadline regardless of the handshakeComplete flag, and
set it to a low value.
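A minimal sketch of bounding a potentially blocking TLS Close() (the
timeout value is illustrative):

    // TLS Close() sends a close_notify alert and can block if the peer
    // is not draining the socket, so bound it with a deadline first.
    conn.SetWriteDeadline(time.Now().Add(2 * time.Second))
    conn.Close()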
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Defaults to 1sec but will be opts.PingInterval if that value is lower.
All non-client connections now invoke this function for the first
PING.
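A sketch of that selection logic (field and method names are
illustrative):

    // Fire the first PING quickly so the RTT is computed early.
    firstPing := time.Second
    if opts.PingInterval < firstPing {
        firstPing = opts.PingInterval
    }
    c.ping.tmr = time.AfterFunc(firstPing, c.processPingTimer)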
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which allows the RTT to be computed
reasonably soon (since the PingInterval can be user-configured
and set much higher).
For routes, in PR #1101 I was sending the PING on receiving the
INFO, which required changing a bunch of tests. Changed that to
also use the first timer interval of 1sec and reverted the changes
to the route tests.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- On startup, verify that the local account in leafnode (if
specified) can be found, otherwise fail startup.
- At runtime, print an error and continue trying to reconnect.
We will need to decide on a better approach.
- When using basic auth (user/password), it was possible for a
solicited LeafNode connection to not use the user/password when
trying a URL that was discovered through gossip. The server
now saves the credentials of a configured URL to use with
the discovered ones (see the sketch after this list).
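A hedged sketch of carrying the configured credentials over to a
discovered URL, using net/url types (the surrounding plumbing is
assumed; discovered and configured are *url.URL):

    // If the gossiped URL carries no userinfo, reuse the configured one.
    if discovered.User == nil && configured.User != nil {
        discovered.User = configured.User
    }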
Updated the RouteRTT test for the case where the RTT does not seem
to be updated because it keeps getting the same value.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added the RTT field to each route reported in routez.
Ensure that when a route is accepted, we send a PING to compute
the first RTT and don't have to wait for the ping timer to fire.
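The reported shape, sketched as an illustrative Go struct (the actual
RouteInfo fields may differ):

    // Each route in the routez response now carries its last computed RTT.
    type RouteInfo struct {
        Rid uint64 `json:"rid"`
        RTT string `json:"rtt,omitempty"`
        // ... other existing fields elided
    }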
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This applies to routes, gateways, and leaf node connections.
The failed attempts will be logged on the first attempt, after the
first minute, and then every hour.
The connect/error statements now include the attempt number.
Note that in debug mode, all attempts are traced, so you may get
a double trace (one for debug, one for info/error).
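A hedged sketch of the throttling condition (deriving the thresholds
from the retry delay is an assumption; attempts and reconnectDelay are
illustrative):

    // Log on the first attempt, after roughly one minute of retries,
    // and then roughly every hour.
    perMinute := int(time.Minute / reconnectDelay)
    perHour := int(time.Hour / reconnectDelay)
    shouldLog := attempts == 1 || attempts == perMinute ||
        (attempts > perMinute && attempts%perHour == 0)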
Resolves #969
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
PR #874 caused an issue in the case where logtime was actually not
configured and not specified on the command line. A reload would then
remove logtime.
Revisited the fix for that and included other boolean flags, such
as debug, trace, etc.
Related to #874
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Use stack buffers
- Ensure that the buffer size is no greater than 90% of max_pending
(see the sketch after this list)
- Added test with low max_pending
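A sketch of the 90% cap, assuming the option field types:

    // Cap the flush buffer at 90% of the configured max_pending.
    maxBuf := opts.MaxPending * 9 / 10 // MaxPending assumed int64
    if int64(bufSize) > maxBuf {
        bufSize = int(maxBuf)
    }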
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Use pending bytes as the slow consumer trigger, so reintroduce
max_pending. Improve latency with in-place flush calls when
appropriate. Use a simple time budget for the readLoop routine.
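A hedged sketch of a read-loop time budget (the names, helpers, and
budget value are all illustrative):

    // Process inbound data, but flush in place once the budget is spent
    // instead of letting latency build up.
    const readLoopBudget = 500 * time.Microsecond
    start := time.Now()
    for c.hasPending() { // hypothetical helper
        c.processOnePending() // hypothetical helper
        if time.Since(start) > readLoopBudget {
            c.flushOutbound()
            start = time.Now()
        }
    }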
Signed-off-by: Derek Collison <derek@nats.io>
When a route connection is created, the server keeps track
of the client structure in a special map until the route protocol
completes. This is done so that if the server is shut down before
the route is registered in the routes map, the server can kick out
the connection's readLoop.
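A sketch of that temporary tracking, with illustrative names:

    // Track the connection until the route protocol completes so that
    // Shutdown() can close it and unblock its readLoop.
    s.mu.Lock()
    s.tmpRoutes[c.cid] = c // illustrative map keyed by client id
    s.mu.Unlock()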
The route connection was correctly removed from that map on success,
but was not for route connections that were never registered and were
dropped. This was not causing any issue, but for correctness, the
removal is now done when the server removes a route connection.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
I noticed that when running the test suite, a file
server/log1.txt would be left behind. This file is created by one of
the config reload tests. Running that test individually did the proper
cleanup. I noticed that the Signal test that was checking
that files could be rotated was causing this side effect.
It turns out that none of the config reload tests were disabling
the signal handler (NoSigs=true), and since the goroutine would
be left running, running the TestSignalToReOpenLogFile() test
would interact with an already finished test.
I put a thread dump in handleSignals() to track all tests that
were causing this function to start the goroutine because NoSigs
was not set to true. I fixed all those tests. At this time, there
are only 2 tests that need to start the signal handler.
I have also fixed the code so that the signal handler goroutine
selects on the server's quitCh, which is closed on shutdown, so that
this goroutine exits and is waited on using the grWG wait group.
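A hedged sketch of the quitCh-aware handler loop (the dispatch helper
is hypothetical; startGoRoutine is assumed to register the goroutine
with the grWG wait group):

    s.startGoRoutine(func() {
        defer s.grWG.Done()
        for {
            select {
            case sig := <-sigCh:
                s.processSignal(sig) // hypothetical: reload, reopen log file, ...
            case <-s.quitCh:
                return
            }
        }
    })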