nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-14 10:10:42 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	47b08335a4	[FIXED] Reset of tlsName only for x509.HostnameError For issue #1256, we cleared the possibly saved tlsName on handshake failure. However, this meant that for normal use cases, if a reconnect failed for any reason we would not be able to reconnect if it is an IP until we get back to the URL that contained the hostname. We now clear only if the handshake error is of x509.HostnameError type, which includes errors such as: ``` "x509: Common Name is not a valid hostname: <x>" "x509: cannot validate certificate for <x> because it doesn't contain any IP SANs" "x509: certificate is not valid for any names, but wanted to match <x>" "x509: certificate is valid for <x>, not <y>" ``` Applied the same logic to solicited gateway connections, and fixed the fact that the tlsConfig should be cloned (since we set the ServerName). I have also made a change for leafnode connections similar to what we are doing for gateway connections, which is to use the saved tlsName only if tlsConfig.ServerName is empty, which may not be the case for users that embed NATS Server and pass directly tls configuration. In other words, if the option TLSConfig.ServerName is not empty, always use this value. Relates to #1256 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-01-28 13:16:38 -07:00
Derek Collison	643e73c0c5	Fix for #1256 , mixed IP and DNS for cluster and TLS with leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2020-01-22 11:25:09 -08:00
Ivan Kozlovic	bdd7fa86e9	Update flapping test We need to wait for the route close to be processed before attempting to recreate it. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-01-10 12:01:59 -07:00
Ivan Kozlovic	c097357b52	[FIXED] More than expected switch to Interest-Only mode for account When an account is switched to interest-only mode due to no interest, it was not possible to switch that account more than once. But the function switchAccountToInterestMode() that triggers a switch could possibly be doing it more than once. This should not cause problems but increased the number of traces in a big super cluster. Also fixed some flappers and a data race. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-01-09 13:35:08 -07:00
Ivan Kozlovic	b42856afa2	Set expectConnect flag for CLIENT only if auth required Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-01-07 10:48:11 -07:00
Ivan Kozlovic	c73be88ac0	Updated based on comments Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-01-06 16:57:48 -07:00
Ivan Kozlovic	947798231b	[UPDATED] TCP Write and SlowConsumer handling - All writes will now be done by the writeLoop, unless when the writeLoop has not been started yet (likely in connection init). - Slow consumers for non CLIENT connections will be reported but not failed. The idea is that routes, gateway, etc.. connections should stay connected as much as possible. However if a flush operation times out and no data at all has been written, the connection will be closed (regardless of type). - Slow consumers due to max pending is only for CLIENT connections. This allows sending of SUBs through routes, etc.. to not have to be chunked. - The backpressure to CLIENT connections is increased (up to 1sec) based on the sub's connection pending bytes level. - Connection is flushed on close from the writeLoop as to not block the "fast path". Some tests have been fixed and adapted since now closeConnection() is not flushing/closing/removing connection in place. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-31 15:06:27 -07:00
Ivan Kozlovic	1b2754475b	Refactor async client tests Updated all tests that use "async" clients. - start the writeLoop (this is in preparation for changes in the server that will not do send-in-place for some protocols, such as PING, etc..) - Added missing defers in several tests - fixed an issue in client.go where test was wrong possibly causing a panic. - Had to skip a test for now since it would fail without server code change. The next step will be ensure that all protocols are sent through the writeLoop and that the data is properly flushed on close (important for -ERR for instance). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-12 11:58:24 -07:00
Derek Collison	ffc3c0da70	Fixed #1144 , qsub performance improvements Signed-off-by: Derek Collison <derek@nats.io>	2019-12-09 22:08:59 +01:00
Ivan Kozlovic	ae99fc3a2a	Fixed issues reported by staticcheck Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-04 17:04:58 -07:00
Ivan Kozlovic	63138509f7	Tune some code/test for Windows Running test suite on a Windows VM, I notice several failures. Updated the compute of the RTT to be at least 1ns. I think that this is just an issue with the VM I am running, but that change will have no impact for normal situations (since setting the rtt to the very minimum duration (1ns) instead of 0) and will prevent some tests from failing. Because of those same timer granularity issues, I had to add some delays between some actions in order for time.Sub()/Since() to actually report something more than 0. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-21 14:32:46 -07:00
Ivan Kozlovic	977c290bf2	[FIXED] Handling of split buffer for LEAF messages Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-18 11:55:18 -07:00
Derek Collison	07253c0517	Merge pull request #1196 from nats-io/daisy Allow interest propagation with daisy-chained leafnodes	2019-11-17 17:46:23 -08:00
Derek Collison	07da68ce56	Allow interest propagation with daisy chained leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2019-11-17 17:35:20 -08:00
Ivan Kozlovic	e0bc81d0ed	Make the Leafnode internal sub on _GR_.> This is needed for mapped gateway replies. We had used an extra token when implementing the new prefix, but it was then removed, but the leafnode subscription on _GR_...*.> was not updated. We now subscribe on _GR_.> There was a test that was passing because we were using inboxes that caused the pattern to match. Replaced with single token reply so that it would have caught this bug. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-17 17:37:09 -07:00
Derek Collison	7b1bea61e2	Merge pull request #1192 from nats-io/load_account Do not fetch accounts on system events.	2019-11-16 18:33:23 -08:00
Derek Collison	f60266bc2e	Merge pull request #1190 from nats-io/import_reply Introduced wildcard handling of _R_ mapped replies.	2019-11-16 18:07:18 -08:00
Derek Collison	093b57ed40	Do not fetch accounts on system events. Noticed we would lookup accounts, but would also fetch them when tracking remote connections, etc. Signed-off-by: Derek Collison <derek@nats.io>	2019-11-16 18:05:42 -08:00
Ivan Kozlovic	3e1728d623	[FIXED] Some accounts locking issues - Risk of deadlock when checking if issuer claim are trusted. There was a RLock() in one thread, then a request for Lock() in another that was waiting for RLock() to return, but the first thread was then doing RLock() which was not acquired because this was blocked by the Lock() request (see `e2160cc571`) - Use proper account/locking mode when checking if stream/service exports/signer have changed. - Account registration race (regression from https://github.com/nats-io/nats-server/pull/890) - Move test from #890 to "no race" test since only then could it detect the double registration. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-16 16:59:38 -07:00
Derek Collison	6ad8287bbe	Introduced wildcard handling of _R_ mapped replies. We had too much special processing, so reduced to a single wildcard which will propagate across routes and gateways and is consistent with gateway handling of globally routed subjects and timeouts. Signed-off-by: Derek Collison <derek@nats.io>	2019-11-16 12:50:53 -08:00
Ivan Kozlovic	bdf5cf63b3	Shutdown on Ctrl+C Changed code on Windows to not use svc code if running in interactive mode. The original code was running svc.debug.Run() which uses service code (Execute()) but from the command line. We don't need that. Also reduced salt on bcrypt password for a config file that started to cause failures due to test taking too long to finish. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-14 20:05:32 -07:00
Derek Collison	3330820502	Fixed a bug where we leaked service imports. Also prior this would have leaked subscriptions as well. Signed-off-by: Derek Collison <derek@nats.io>	2019-11-14 13:29:17 -08:00
Ivan Kozlovic	3e5ede1d64	Relax check on reserved GW prefix for system clients Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-11 17:43:14 -07:00
Ivan Kozlovic	aa843945c9	Work on Gateways reply mapping - New prefix that includes origin server for the request - Mapping done if request is service import or requestor has recent subscription - Subscription considered recent if less than 250ms - Destination server strip GW prefix before giving to client and restore when getting a reply on that subject - Mapping removed after 250ms - Server rejects client publish on "$GNR." (the new prefix) - Cluster and server hash are now 8 chars long and from base 62 alphabets - Mapped replies need to be sent to leafnode servers due to race (cluster B sends RS+ on GW inbound then RMSG on outbound, the RS+ may be processed later and cluster A may have given message to LN before RS+ on reply subject. So LN needs to accept the mapped reply but will strip to give to client and reassemble before sending it back) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-06 16:06:49 -07:00
Derek Collison	8a69c5cb71	Updates to benchmarks Allow disabling of short first ping timer for clients. Adjust names so that full test suite results are aligned. Removed the account lookup, we use sync.Map but also a no-lock cache. Signed-off-by: Derek Collison <derek@nats.io>	2019-11-02 08:04:22 -07:00
Derek Collison	f0f807f99a	After speaking with Ivan we are taking a better approach for initial RTT. Ivan had the idea of using the CONNECT to establish a first estimate of RTT without additional PING/PONGs. Signed-off-by: Derek Collison <derek@nats.io>	2019-10-31 14:01:55 -07:00
Derek Collison	13f217635f	Wait on requestor RTT when tracking latency. If a client RTT for a requestor is longer than a service RTT, the requestor latency was often zero. We now wait for the RTT (if zero) before sending out the metric. Signed-off-by: Derek Collison <derek@nats.io>	2019-10-31 08:02:45 -07:00
Ivan Kozlovic	cbbc21ac25	Some update to leafnode subscription handling - Send all subs in place if smap is small - Skip sending update until after sendAllLeafSubs() is done Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-10-30 20:01:49 -06:00
Ivan Kozlovic	fe27aec1dc	Merge pull request #1170 from nats-io/fix_detect_leafnode_loop [FIXED] Detect loop between LeafNode servers	2019-10-29 18:35:20 -06:00
Ivan Kozlovic	e3009ffb6e	Fix latency test flapper. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-10-29 17:03:48 -06:00
Ivan Kozlovic	279cab2aaf	[FIXED] Detect loop between LeafNode servers This is achieved by subscribing to a unique subject. If the LS+ protocol is coming back for the same subject on the same account, then this indicates a loop. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-10-29 16:14:35 -06:00
Derek Collison	35758ef7d4	Update the test CA and certs. Expiration is now Oct 14 14:30:41 2029 GMT Signed-off-by: Derek Collison <derek@nats.io>	2019-10-17 07:33:08 -07:00
Derek Collison	7cb6056a94	Account support for Connz and user or account filtering 1. Accounts will show up in connection info if auth=1. 2. You can filter by user (?auth=1&user=ivan) or account (?auth=1&acc=eng) Signed-off-by: Derek Collison <derek@nats.io>	2019-10-11 10:22:08 -07:00
Waldemar Quevedo	d44b0dec51	Merge pull request #1136 from nats-io/svc-latency-values Adjust to zero negative latency values	2019-09-20 11:39:33 -05:00
Waldemar Quevedo	d0e36f3b88	Adjust to zero negative latency values Signed-off-by: Waldemar Quevedo <wally@synadia.com>	2019-09-20 09:24:18 -07:00
Derek Collison	7fe47ace2b	Make sure to turn latency on with a claim update Signed-off-by: Derek Collison <derek@nats.io>	2019-09-19 14:20:35 -07:00
Derek Collison	0551371b31	Add in JWT support for tracking latency Signed-off-by: Derek Collison <derek@nats.io>	2019-09-18 08:51:43 -07:00
Derek Collison	b98b75b166	Merge pull request #1127 from nats-io/sysdebug System level services for debugging.	2019-09-17 09:45:53 -07:00
Derek Collison	52430c304a	System level services for debugging. This is the first pass at introducing exported services to the system account for general debugging of blackbox systems. The first service reports number of subscribers for a given subject. The payload of the request is the subject, and optional queue group, and can contain wildcards. Signed-off-by: Derek Collison <derek@nats.io>	2019-09-17 09:37:35 -07:00
Ivan Kozlovic	15201a19cd	Fixed a lock inversion issue with account In updateRouteSubscriptionMap(), when a queue sub is added/removed, the code locks the account and then the route to send the update. However, when a route is accepted and the subs are sent, the opposite (locking wise) occurs. The route is locked, then the account. This lock inversion is possible because a route is registered (added to the server's map) and then the subs are sent. Use a special lock to protect the send, but don't hold the acc.mu lock while getting the route's lock. The tests that were created for the original missed queue updates issue, namely TestClusterLeaksSubscriptions() and TestQueueSubWeightOrderMultipleConnections() pass with this change. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-09-13 14:30:00 -06:00
Derek Collison	94f143ccce	Latency tracking updates. Will now breakout the internal NATS latency to show requestor client RTT, responder client RTT and any internal latency caused by hopping between servers, etc. Signed-off-by: Derek Collison <derek@nats.io>	2019-09-11 16:43:19 -07:00
Derek Collison	bb11f7bd2d	Merge pull request #1111 from nats-io/latency Track latency for exported services	2019-08-30 11:02:36 -07:00
Derek Collison	7989118c3f	First pass latency tracking for exported services Signed-off-by: Derek Collison <derek@nats.io>	2019-08-30 10:52:48 -07:00
Ivan Kozlovic	2a8973a62b	Fixed flushOutbound With Go 1.12 (strangely was not able to reproduce with Go 1.11) the test TestRouteNoCrashOnAddingSubToRoute() would frequently lock up and consume all avail CPUs on the machine. Running this test with GOMAXPROCS=2 you would see server.test CPU usage pegged at 200% (assuming you have at least 2 CPUs). The reason was that the writeLoop was spinning because another routine was already in flushOutbound() and stack trace would show that it was stuck in system calls. It seems that the writeLoop releasing the lock but grabbing it again right away was not allowing the syscall to complete. So decided to put the unlock/gosched/lock back in flushOutbound() when flag is already set, but then protect the closeConnection() with its own flag (similar to clearConnection) to not re-introduce issue fixed in #1092. Had to fix the benchmark test RoutedInterestGraph because after a route is accepted, the initial PING will be sent after 1sec which was breaking this test. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-29 12:59:27 -06:00
Ivan Kozlovic	90d592e163	Leaf and Route RTT When a leaf or route connection is created, set the first ping timer to fire at 1sec, which will allow to compute the RTT reasonably soon (since the PingInterval could be user configured and set much higher). For Route in PR #1101, I was sending the PING on receiving the INFO which required changing bunch of tests. Changing that to also use the first timer interval of 1sec and reverted changes to route tests. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-26 09:34:17 -06:00
Ivan Kozlovic	7ca8723942	[FIXED] Some Leafnode issues - On startup, verify that local account in leafnode (if specified) can be found, otherwise fail startup. - At runtime, print error and continue trying to reconnect. Will need to decide a better approach. - When using basic auth (user/password), it was possible for a solicited Leafnode connection to not use user/password when trying a URL that was discovered through gossip. The server now saves the credentials of a configured URL to use with the discovered ones. Updated RouteRTT test in case RTT does not seem to be updated because getting always the same value. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-23 14:08:07 -06:00
Ivan Kozlovic	2959b982ea	Merge pull request #1101 from nats-io/route_rtt [ADDED] RTT in routez's route info	2019-08-20 17:23:18 -06:00
Ivan Kozlovic	77c63dbce1	Fix flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-20 17:07:22 -06:00
Ivan Kozlovic	89dd13f134	[ADDED] RTT in routez's route info Added the RTT field to each route reported in routez. Ensure that when a route is accepted, we send a PING to compute the first RTT and don't have to wait for the ping timer to fire. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-20 14:16:07 -06:00
Ivan Kozlovic	e230e7fde9	Attempt at fixing flapper again Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-15 09:06:56 -06:00


418 Commits