nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-17 03:24:40 -07:00

Author	SHA1	Message	Date
Derek Collison	3877ee2411	Merge branch 'main' into dev	2022-12-13 13:08:35 -08:00
Marco Primi	f8a030bc4a	Use testing.TempDir() where possible Refactor tests to use go built-in temporary directory utility for tests. Also avoid binding to default port (which may be in use)	2022-12-12 13:18:44 -08:00
Derek Collison	0fd8ad6905	Fix flapping test Signed-off-by: Derek Collison <derek@nats.io>	2022-12-06 05:10:15 -08:00
Ivan Kozlovic	3ec42d5b85	Updates to PR #3611 - Save the TLS name only if not already set - Use the passed URLs slice instead of using s.getOpts().Routes - Enhanced the test - Fixed an unrelated DATA RACE report Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-08 09:36:08 -07:00
Ivan Kozlovic	2d181e1c27	[FIXED] Routing: TLS connections to discovered server may fail The server was not setting "server name" in the TLS configuration for route connections, which may lead to failed (re)connect if the certificate does not allow for the IP and the URL did not have the hostname, which would happen with gossip protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-07 17:26:17 -07:00
Ivan Kozlovic	6113c52ae1	[FIXED] Solicited route may not retry to reconnect Originally, only solicited routes were retried in case of a disconnect, but that was before gossip protocol was introduced. Since then, two servers that connect to each other due to gossip should retry to reconnect if the connection breaks, even if the route is not explicit. However, server will retry only once or more accurately, ConnectRetries+1. This PR solves the issue that the reconnect attempt was not initiated for a "solicited route" that was not explicit. Maybe related to #3571 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-10-24 10:39:23 -06:00
Ivan Kozlovic	b69ffe244e	Fixed some tests Code change: - Do not start the processMirrorMsgs and processSourceMsgs go routine if the server has been detected to be shutdown. This would otherwise leave some go routine running at the end of some tests. - Pass the fch and qch to the consumerFileStore's flushLoop otherwise in some tests this routine could be left running. Tests changes: - Added missing defer NATS connection close - Added missing defer server shutdown Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-08 11:28:23 -06:00
Marco Primi	f1883561ee	Use testing.TB interface instead of *T Using interface allows reusing helper function in benchmarks	2022-08-31 14:52:45 -07:00
Ivan Kozlovic	b6208c775b	[FIXED] Memory leak when unsubscribing the last queue subscription A server maintains a map for the subject+queue to know the number of members on the same group. However, on unsubscribe when we get to the last one being unsubscribed, we were removing from the map but then unfortunately adding back with a value of 0, which caused a leak. If the same subscription was coming back, then this map entry would be reused, but if it is a never coming back queue sub, then memory could increase continuously. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-04 18:42:13 -06:00
Matthias Hanel	d53d2d0484	[Added] account specific monitoring endpoint(s) (#3250) Added http monitoring endpoint /accstatz It responds with a list of statz for all accounts with local connections the argument "unused=1" can be provided to get statz for all accounts This endpoint is also exposed as nats request under: This monitoring endpoint is exposed via the system account. $SYS.REQ.ACCOUNT..STATZ Each server will respond with connection statistics for the requested account. The format of the data section is a list (size 1) identical to the event $SYS.ACCOUNT.%s.SERVER.CONNS which is sent periodically as well as on connect/disconnect. Unless requested by options, server without the account, or server where the account has no local connections, will not respond. A PING endpoint exists as well. The response format is identical to $SYS.REQ.ACCOUNT..STATZ (however the data section will contain more than one account, if they exist) In addition to general filter options the request takes a list of accounts and an argument to include accounts without local connections (disabled by default) $SYS.REQ.ACCOUNT.PING.STATZ Each account has a new system account import where the local subject $SYS.REQ.ACCOUNT.PING.STATZ essentially responds as if the importing account name was used for $SYS.REQ.ACCOUNT..STATZ The only difference between requesting ACCOUNT.PING.STATZ from within the system account and an account is that the latter can only retrieve statz for the account the client requests from. Also exposed the monitoring /healthz via the system account under $SYS.REQ.SERVER..HEALTHZ $SYS.REQ.SERVER.PING.HEALTHZ No dedicated options are available for these. HEALTHZ also accept general filter options. Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-07-12 21:50:32 +02:00
Matthias Hanel	0f113aa3d5	[FIXED] subject renaming with hand crafted reply subject (#3026) do so by rejecting jsackprefix in reply subjects Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-04-11 22:32:02 -04:00
Ivan Kozlovic	b4128693ed	Ensure file path is correct during stream restore Also had to change all references from `path.` to `filepath.` when dealing with files, so that it works properly on Windows. Fixed also lots of tests to defer the shutdown of the server after the removal of the storage, and fixed some config files directories to use the single quote `'` to surround the file path, again to work on Windows. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-09 13:31:51 -07:00
Ivan Kozlovic	a025ce7472	Set defaultServerOptions port to -1 for random Updated some tests based on this change but also missing defer connection close or server shutdown. Fixed how the OCSP run go routine would shutdown, which would never complete because grWG was not decremented by this go routine prior to invoking s.Shutdown() Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-09-02 14:22:56 -06:00
Derek Collison	476c264560	If we are in a simple mixed-mode setup with just global account and system account and clustered, allow pass through. Signed-off-by: Derek Collison <derek@nats.io>	2021-08-26 09:41:01 -07:00
Derek Collison	944dd248c4	Fix for tests Signed-off-by: Derek Collison <derek@nats.io>	2021-08-14 17:39:51 -07:00
Ivan Kozlovic	4865dc7ae3	[CHANGED] Check that max_payload is not greater than max_pending This is related to PR #2407. Since the 64MB pending size is actually configurable, we should fail only if max_payload is greater than the configured max_pending. This is done in validateOptions() which covers both config file and direct options in embedded cases. The check in opts.go is reverted to max int32 since at this point we don't know if/what max_pending will be, so we simply check that it is not more than a int32. For the next minor release, we could have another change that imposes a lower limit to max_payload (regardless if max_pending is higher). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-08-04 16:33:21 -06:00
Ivan Kozlovic	d7933631a9	[FIXED] Failed route TLS handshake would leave failed conn's lock, locked This is a regression introduced in v2.2.6. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-06-22 14:05:43 -06:00
Jaime Piña	e12181cb83	Return not ready for connection reason Currently, we use ReadyForConnections in server tests to wait for the server to be ready. However, when this fails we don't get a clue about why it failed. This change adds a new unexported method called readyForConnections that returns an error describing which check failed. The exported ReadyForConnections version works exactly as before. The unexported version gets used in internal tests only.	2021-04-20 11:45:08 -07:00
Ivan Kozlovic	a7e5853a3c	Fixed expected pid path in options This was introduced by PR #2071. On some tests, options are loaded based on a config file that has the pid set to "/tm/nats-server/nats-server.pid", however, the expected option's pid path was set based on tmpRoot. The problem is that on macOS, that value would be "/var/folders/xxx" which would not match. So this PR simply reverts the changes to the expected pid file name: it simply needs to match what was in the test.conf file. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-08 15:08:23 -06:00
Jaime Piña	d929ee1348	Check errors when removing test directories and files Currently in tests, we have calls to os.Remove and os.RemoveAll where we don't check the returned error. This hides useful error messages when tests fail to run, such as "too many open files". This change checks for more filesystem related errors and calls t.Fatal if there is an error.	2021-04-07 11:09:47 -07:00
Jaime Piña	e44275b963	Consolidate temporary test files and directories Currently, temporary test files and directories are written in lots of different paths within the OS's temp dir. This makes it hard to know which files are from nats-server and which are unrelated. This in turn makes it hard to clean up nats-server test files.	2021-04-06 10:42:55 -07:00
Derek Collison	f0cdf89c61	JetStream Clustering WIP Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 01:14:52 -08:00
Ivan Kozlovic	13df1a55fd	Changed warning message Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-09 09:36:30 -06:00
Ivan Kozlovic	df9d5f5fd9	Accepting route warns if remote server has same name Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-08 17:59:33 -06:00
Ivan Kozlovic	2ad2bed170	[ADDED] Support for route hostname resolution We previously simply called DialTimeout() on a route's url when soliciting. If it resolved to the IP of the host, it would create a route to self, which server detects, but then would not try again with other IPs that would have allowed to form a cluster with other servers running on the other IPs. This PR keeps track of local IPs + cluster port and excludes them from the list of IPs returned by LookupHost API. This even prevents solicitation of routes to self. Only non-local IPs will be tried. Resolves #1586 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-09-08 13:40:17 -06:00
Phil Pennock	3c680eceb9	Inhibit Go's default TCP keepalive settings for NATS (#1562) Inhibit Go's default TCP keepalive settings for NATS Go 1.13 changed the semantics of the tuning parameters for TCP keepalives, including the default value. This affects all TCP listeners. The NATS protocol has its own L7 keepalive system (PING/PONG) and the Go defaults are not a good fit for some valid deployment scenarios, while Go doesn't directly expose a working API for tuning these. Rather than add a configuration knob and pull in another dependency (with portability issues) just disable TCP keepalives for all listeners used for speaking the NATS protocol. Change the tests so we test the same logic. Do not change HTTP monitoring, profiling, or the websocket API listeners. Change KeepAlive on client connections too.	2020-08-14 13:37:59 -04:00
Ivan Kozlovic	96ccf91566	[FIXED] Possible deadlock with solicited leafnodes when cluster conflict We cannot call c.closeConnection() under the server lock because closeConnection() can invoke server lock in some cases. Created a test that should run without `-race` to reproduce the deadlock (which it does) but sometimes would fail because cluster would not be formed. This uncovered an issue with conflict resolution which test TestRouteClusterNameConflictBetweenStaticAndDynamic() can reproduce easily. The issue was that we were not updating a dynamic name with the remote if the remote was non dynamic. Resolves #1543 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-30 18:45:36 -06:00
Ivan Kozlovic	9288283d90	Fixed accept loops that could leave connections opened This was discovered with the test TestLeafNodeWithGatewaysServerRestart that was sometimes failing. Investigation showed that when cluster B was shutdown, one of the servers on A that had a connection from B that just broke tried to reconnect (as part of reconnect retries of implicit gateways) to a server in B that was in the process of shutting down. The connection had been accepted but createGateway not called because the server's running boolean had been set to false as part of the shutdown. However, the connection was not closed so the server on A had a valid connection to a dead server from cluster B. When the B cluster (now single server) was restarted and a LeafNode connection connected to it, then the gateway from B to A was created, that server on A did not create outbound connection to that B server because it already had one (the zombie one). So this PR strengthens the starting of accept loops and also makes sure that if a connection (all types of connections) is not accepted because the server is shutting down, that connection is properly closed. Since all accept loops had almost same code, made a generic function that accepts functions to call specific create connection functions. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-06 17:03:19 -06:00
Derek Collison	146d8f5dcb	Updates based on feedback, sped up some slow tests Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 17:26:43 -07:00
Derek Collison	dd61535e5a	Cluster names are now required. Added cluster names as required for prep work for clustered JetStream. System can dynamically pick a cluster name and settle on one even in large clusters. Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 15:48:38 -07:00
Ivan Kozlovic	b9bd5c2d35	Fixed flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-09 15:34:52 -06:00
Derek Collison	2bd7553c71	System Account on by default. Most of the changes are to turn it off for tests that were watching subscriptions and such. Signed-off-by: Derek Collison <derek@nats.io>	2020-05-29 17:56:45 -07:00
Ivan Kozlovic	f76f0df5ce	Remove update of start in readLoop That broke sending async INFO in case where there was an update between accepting the tcp connection and receiving the CONNECT that indicates that client can receive async INFO. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-25 06:58:23 -07:00
Ivan Kozlovic	a22da91647	[FIXED] Closing of Gateway or Route TLS connection may hang This could happen if the remote server is running but not dequeueing from the socket. TLS connection Close() may send/read and so we need to protect with a deadline. For non client/leaf connection, do not call flushOutbound(). Set the write deadline regardless of handshakeComplete flag, and set it to a low value. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-04 17:27:00 -07:00
Ivan Kozlovic	cd9f898eb0	Made a server's helper to set first ping timer Defaults to 1sec but will be opts.PingInterval if value is lower. All non client connections invoked this function for the first PING. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-26 10:21:43 -06:00
Ivan Kozlovic	90d592e163	Leaf and Route RTT When a leaf or route connection is created, set the first ping timer to fire at 1sec, which will allow to compute the RTT reasonably soon (since the PingInterval could be user configured and set much higher). For Route in PR #1101, I was sending the PING on receiving the INFO which required changing bunch of tests. Changing that to also use the first timer interval of 1sec and reverted changes to route tests. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-26 09:34:17 -06:00
Ivan Kozlovic	7ca8723942	[FIXED] Some Leafnode issues - On startup, verify that local account in leafnode (if specified can be found otherwise fail startup). - At runtime, print error and continue trying to reconnect. Will need to decide a better approach. - When using basic auth (user/password), it was possible for a solicited Leafnode connection to not use user/password when trying an URL that was discovered through gossip. The server now saves the credentials of a configured URL to use with the discovered ones. Updated RouteRTT test in case RTT does not seem to be updated because getting always the same value. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-23 14:08:07 -06:00
Ivan Kozlovic	2959b982ea	Merge pull request #1101 from nats-io/route_rtt [ADDED] RTT in routez's route info	2019-08-20 17:23:18 -06:00
Ivan Kozlovic	77c63dbce1	Fix flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-20 17:07:22 -06:00
Ivan Kozlovic	89dd13f134	[ADDED] RTT in routez's route info Added the RTT field to each route reported in routez. Ensure that when a route is accepted, we send a PING to compute the first RTT and don't have to wait for the ping timer to fire. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-20 14:16:07 -06:00
Ivan Kozlovic	03930ba0e4	[UPDATED] Reduce report of failed connection attempts This applies to routes, gateways and leaf node connections. The failed attempts will be printed at the first, after the first minute and then every hour. The connect/error statements now include the attempt number. Note that in debug mode, all attempts are traced, so you may get double trace (one for debug, one for info/error). Resolves #969 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-20 10:13:56 -06:00
Derek Collison	d7140a0fd1	Update for client rename Signed-off-by: Derek Collison <derek@nats.io>	2019-05-10 15:11:30 -07:00
Ivan Kozlovic	288f00ff81	Fixed panic when server needs to send message to more than 8 routes Resolves #955 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-04-17 13:02:41 -06:00
Ivan Kozlovic	d654b18476	Fixed reload of boolean flags PR #874 caused an issue in case logtime was actually not configured and not specified in the command line. A reload would then remove logtime. Revisited the fix for that and included other boolean flags, such as debug, trace, etc.. Related to #874 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-01-14 19:18:00 -07:00
Derek Collison	18bca5603f	Added server version and cluster name to statsz. Fixed account connection accounting sending after local connections is 0. Signed-off-by: Derek Collison <derek@nats.io>	2018-12-06 10:57:39 -08:00
Ivan Kozlovic	1817b354e3	Update tests on Travis with tweaked GC settings Moved some tests to "no race" tests that are run separately. Removing -v and adding -p=1. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-11-08 16:56:20 -07:00
Derek Collison	ea5a6d9589	Updates for comments, some golint fixes Signed-off-by: Derek Collison <derek@nats.io>	2018-10-31 20:28:44 -07:00
Derek Collison	47963303f8	First pass at new cluster design Signed-off-by: Derek Collison <derek@nats.io>	2018-10-24 21:29:29 -07:00
Ivan Kozlovic	d5ceade750	Merge pull request #753 from nats-io/route_perms_reload [ADDED] Support for route permissions config reload	2018-09-27 10:08:55 -06:00
Ivan Kozlovic	2a1811b600	Fixed flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2018-09-26 15:58:48 -06:00


109 Commits
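
Note on commit b6208c775b: the queue-subscription leak described in that message can be illustrated with a minimal Go sketch. This is not the server's actual code. The type queueCounts and its add and remove methods are hypothetical names chosen for illustration, assuming only what the commit message states: a map keyed by subject+queue holds the member count, and the entry must be deleted, not re-added with a value of 0, when the last member unsubscribes.

```go
package main

import "fmt"

// queueCounts is a hypothetical stand-in for a map that tracks how many
// members of a queue group are subscribed, keyed by "subject queue".
type queueCounts map[string]int

// add records one more member for the given subject+queue key.
func (q queueCounts) add(key string) {
	q[key]++
}

// remove drops one member. When the last member leaves, the entry is
// deleted outright instead of being stored back with a count of 0,
// so keys that never come back do not accumulate in the map.
func (q queueCounts) remove(key string) {
	n, ok := q[key]
	if !ok {
		return
	}
	if n <= 1 {
		delete(q, key)
		return
	}
	q[key] = n - 1
}

func main() {
	q := queueCounts{}
	q.add("orders workers")
	q.add("orders workers")
	q.remove("orders workers")
	q.remove("orders workers")
	// After the last unsubscribe the map holds no entry for the key.
	fmt.Println("entries left:", len(q))
}
```

The buggy pattern the message describes would be equivalent to writing q[key] = 0 after the delete, which leaves one map entry behind per distinct subject+queue pair.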