nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-13 17:58:00 -07:00

Author	SHA1	Message	Date
Waldemar Quevedo	d0e36f3b88	Adjust to zero negative latency values Signed-off-by: Waldemar Quevedo <wally@synadia.com>	2019-09-20 09:24:18 -07:00
Derek Collison	7fe47ace2b	Make sure to turn latency on with a claim update Signed-off-by: Derek Collison <derek@nats.io>	2019-09-19 14:20:35 -07:00
Derek Collison	0551371b31	Add in JWT support for tracking latency Signed-off-by: Derek Collison <derek@nats.io>	2019-09-18 08:51:43 -07:00
Derek Collison	b98b75b166	Merge pull request #1127 from nats-io/sysdebug System level services for debugging.	2019-09-17 09:45:53 -07:00
Derek Collison	52430c304a	System level services for debugging. This is the first pass at introducing exported services to the system account for generally debugging of blackbox systems. The first service reports number of subscribers for a given subject. The payload of the request is the subject, and optional queue group, and can contain wildcards. Signed-off-by: Derek Collison <derek@nats.io>	2019-09-17 09:37:35 -07:00
Ivan Kozlovic	15201a19cd	Fixed a lock inversion issue with account. In updateRouteSubscriptionMap(), when a queue sub is added/removed, the code locks the account and then the route to send the update. However, when a route is accepted and the subs are sent, the opposite (locking wise) occurs. The route is locked, then the account. This lock inversion is possible because a route is registered (added to the server's map) and then the subs are sent. Use a special lock to protect the send, but don't hold the acc.mu lock while getting the route's lock. The tests that were created for the original missed queue updates issue, namely TestClusterLeaksSubscriptions() and TestQueueSubWeightOrderMultipleConnections(), pass with this change. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-09-13 14:30:00 -06:00
Derek Collison	94f143ccce	Latency tracking updates. Will now breakout the internal NATS latency to show requestor client RTT, responder client RTT and any internal latency caused by hopping between servers, etc. Signed-off-by: Derek Collison <derek@nats.io>	2019-09-11 16:43:19 -07:00
Derek Collison	bb11f7bd2d	Merge pull request #1111 from nats-io/latency Track latency for exported services	2019-08-30 11:02:36 -07:00
Derek Collison	7989118c3f	First pass latency tracking for exported services Signed-off-by: Derek Collison <derek@nats.io>	2019-08-30 10:52:48 -07:00
Ivan Kozlovic	2a8973a62b	Fixed flushOutbound. With Go 1.12 (strangely was not able to reproduce with Go 1.11) the test TestRouteNoCrashOnAddingSubToRoute() would frequently lock up and consume all avail CPUs on the machine. Running this test with GOMAXPROCS=2 you would see server.test CPU usage pegged at 200% (assuming you have at least 2 CPUs). The reason was that the writeLoop was spinning because another routine was already in flushOutbound() and stack trace would show that it was stuck in system calls. It seems that the writeLoop releasing the lock but grabbing it again right away was not allowing the syscall to complete. So decided to put the unlock/gosched/lock back in flushOutbound() when flag is already set, but then protect the closeConnection() with its own flag (similar to clearConnection) to not re-introduce issue fixed in #1092. Had to fix the benchmark test RoutedInterestGraph because after a route is accepted, the initial PING will be sent after 1sec which was breaking this test. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-29 12:59:27 -06:00
Ivan Kozlovic	90d592e163	Leaf and Route RTT When a leaf or route connection is created, set the first ping timer to fire at 1sec, which will allow to compute the RTT reasonably soon (since the PingInterval could be user configured and set much higher). For Route in PR #1101, I was sending the PING on receiving the INFO which required changing bunch of tests. Changing that to also use the first timer interval of 1sec and reverted changes to route tests. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-26 09:34:17 -06:00
Ivan Kozlovic	7ca8723942	[FIXED] Some Leafnode issues - On startup, verify that local account in leafnode (if specified) can be found, otherwise fail startup. - At runtime, print error and continue trying to reconnect. Will need to decide a better approach. - When using basic auth (user/password), it was possible for a solicited Leafnode connection to not use user/password when trying an URL that was discovered through gossip. The server now saves the credentials of a configured URL to use with the discovered ones. Updated RouteRTT test in case RTT does not seem to be updated because it always gets the same value. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-23 14:08:07 -06:00
Ivan Kozlovic	2959b982ea	Merge pull request #1101 from nats-io/route_rtt [ADDED] RTT in routez's route info	2019-08-20 17:23:18 -06:00
Ivan Kozlovic	77c63dbce1	Fix flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-20 17:07:22 -06:00
Ivan Kozlovic	89dd13f134	[ADDED] RTT in routez's route info Added the RTT field to each route reported in routez. Ensure that when a route is accepted, we send a PING to compute the first RTT and don't have to wait for the ping timer to fire. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-20 14:16:07 -06:00
Ivan Kozlovic	e230e7fde9	Attempt at fixing flapper again Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-15 09:06:56 -06:00
Ivan Kozlovic	fc8087daa7	Updates based on comments - add sha256 algo - move some mem hungry tests while running with -race to the norace - remove GOGC=10 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-15 09:06:56 -06:00
Ivan Kozlovic	07e3db6b8e	Prepare for v2.0.4 with goreleaser Also fixed some flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-15 09:06:56 -06:00
Derek Collison	507432648b	flapper Signed-off-by: Derek Collison <derek@nats.io>	2019-07-28 07:10:37 -07:00
Derek Collison	8bfe14bbfd	check response perms more often, make sure we limit memory growth Signed-off-by: Derek Collison <derek@nats.io>	2019-07-25 16:53:54 -07:00
Derek Collison	495a1a7ec3	Allow dynamic publish permissions based on reply subjects of received msgs Signed-off-by: Derek Collison <derek@nats.io>	2019-07-25 13:17:26 -07:00
Derek Collison	1d6c58074f	Fix for #1065 (leaked subscribers from dq subs across routes) Signed-off-by: Derek Collison <derek@nats.io>	2019-07-22 17:17:43 -07:00
Alberto Ricart	273e5af0a8	Fixed an issue where the leaf authentication was not checking for account/signers, so user JWTs signed by a signer failed authentication.	2019-07-17 16:03:55 -04:00
Ivan Kozlovic	0873b46f67	[FIXED] LeafNode urls may be missing in INFO sent to LN connections When a cluster of servers are having routes to each other, there is a chance that the list of leafnode URLs maintained on each server is not complete. This would result in LN servers connecting to this cluster to not get the full list of possible URLs the server could reconnect to. Also fixed a DATA RACE that appeared when running the updated TestLeafNodeInfoURLs test. Fixed the race and added specific test that easily demonstrated the race: TestLeafNodeNoRaceGeneratingNonce Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-07-12 19:15:30 -06:00
Derek Collison	18a2c357e4	Merge pull request #1072 from nats-io/handshake Report authorization error and use TLS hostname for IPs on leafnodes.	2019-07-12 14:11:53 -07:00
Derek Collison	a795920dc3	Report authorization error and use TLS hostname for IPs on leafnodes. Signed-off-by: Derek Collison <derek@nats.io>	2019-07-12 13:57:16 -07:00
Ivan Kozlovic	37d08a6c56	[FIXED] Allow TLS InsecureSkipVerify again This has an effect only on connections created by the server, so routes and gateways (explicit and implicit). Make sure that an explicit warning is printed if the insecure property is set, but otherwise allow it. Resolves #1062 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-07-12 12:10:28 -06:00
Derek Collison	951ae49100	Prevent multiple solicited leafnodes from forming cycles. When a solicited leafnode comes from multiple servers that themselves are a cluster, cycles were formed. This change allows solicited leafnodes to behave similarly to gateways in that each server of a cluster is expected to have a solicited leafnode per destination account and cluster. We no longer forward subscription interest or messages to a cluster from a server that has a solicited leafnode. Signed-off-by: Derek Collison <derek@nats.io>	2019-07-10 20:16:47 -07:00
Derek Collison	10d4f1ab7a	Convert leafnode solicited remotes to array Signed-off-by: Derek Collison <derek@nats.io>	2019-07-10 11:53:34 -07:00
Derek Collison	a61d32a82c	Test for staggered leafnodes and sub/pub. Verifies fix for #1066 Signed-off-by: Derek Collison <derek@nats.io>	2019-07-10 09:57:43 -07:00
Derek Collison	074c87d49e	Merge pull request #1060 from nats-io/gr Make sure we route responses across leafnodes	2019-07-08 17:07:57 -07:00
Derek Collison	49707317a1	Make sure we route responses across leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2019-07-08 16:20:40 -07:00
Derek Collison	f76a6b9a5c	When a bound account's maxpayload is not the same make sure we send it to clients that can do async INFO. Signed-off-by: Derek Collison <derek@nats.io>	2019-07-08 15:20:23 -07:00
Ivan Kozlovic	156511bba7	[FIXED] Check of maxpayload could be bypassed if size overruns int32. One could craft a PUB protocol to cause server to panic. This can happen if the size in the PUB protocol overruns an int32. (Note that if authorization is enabled, the user would need to authenticate first, limiting the impact.) Thank you to Aviv Sasson and Ariel Zelivansky from Twistlock for the security report! Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-07-01 15:06:08 -06:00
Derek Collison	e83e0a7f5c	Merge pull request #1048 from nats-io/ping Stagger first ping from server and suppress pings if a ping was received.	2019-07-01 12:06:32 -07:00
Derek Collison	e11a959584	Send ping when RTT update needed Signed-off-by: Derek Collison <derek@nats.io>	2019-07-01 11:58:06 -07:00
Derek Collison	8a3db71ad5	Updates from comments Signed-off-by: Derek Collison <derek@nats.io>	2019-07-01 08:47:13 -07:00
Derek Collison	100d0d2b02	Use default port for leafnode remote if not specified Signed-off-by: Derek Collison <derek@nats.io>	2019-06-29 17:50:21 -07:00
Derek Collison	ebd4deb8b9	Stagger first ping from server and suppress pings if a ping was received. Signed-off-by: Derek Collison <derek@nats.io>	2019-06-29 15:43:15 -07:00
Derek Collison	5b42b99dc1	Allow operator to be inline JWT. Also preloads just warn on validation issues, do not stop starting or reloads. We issue validation warnings now to the log. Signed-off-by: Derek Collison <derek@nats.io>	2019-06-24 16:46:22 -07:00
Derek Collison	d1a782e014	Messages not distributed evenly when sourced from leafnode. When messages came from a leafnode they were not being distributed evenly to the destination cluster. Signed-off-by: Derek Collison <derek@nats.io>	2019-06-11 20:37:49 -07:00
Ivan Kozlovic	ed1901c792	Update go.mod to satisfy v2 requirements Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-06-03 19:45:47 -06:00
Derek Collison	2a8e630bf1	Fix for leafnode and dq selection over GWs Signed-off-by: Derek Collison <derek@nats.io>	2019-06-01 16:43:54 -07:00
Derek Collison	adba6dc023	Add in leafnode bound account events for accounting Signed-off-by: Derek Collison <derek@nats.io>	2019-05-31 16:58:27 -07:00
Derek Collison	3cf6f6a5d2	Bug fix for service import with leafnodes and gws Signed-off-by: Derek Collison <derek@nats.io>	2019-05-31 11:22:02 -07:00
Derek Collison	874f06a212	Fix bugs on reloadAuthorization. When tls is on routes it can cause reloadAuthorization to be called. We were assuming configured accounts, but did not copy the remote map. This copies the remote map when transferring for configured accounts and also handles operator mode. In operator mode we leave the accounts in place, and if we have a memory resolver we will remove accounts that are no longer defined or have bad claims. Signed-off-by: Derek Collison <derek@nats.io>	2019-05-29 13:19:58 -07:00
Ivan Kozlovic	4ed08dde07	Merge pull request #1013 from nats-io/fix_gw_qinterest_loss Fixed loss of queue subscription interest across Gateways in some cases	2019-05-26 18:23:06 -06:00
Ivan Kozlovic	ce1e6defab	Fix flappers - TestSystemAccountConnectionUpdatesStopAfterNoLocal: I believe that the check on number of notifications was wrong. Since we did not consume the ones for the connect, the expected count after the disconnect is 8 instead of 4. - Possible fix GW tests complaining about number of outbound/inbound I think that it may be possible that connection does not succeed right away (remote to fully started, etc) and due to dial timeout and reconnect attempt delay, I suspect that when given a max time of 1sec to complete, it may not be enough. Quick change for now is to override to 2secs for now in the wait helpers. If that proves conclusive, we could remove the timeout given to these helpers. - TestGatewaySendAllSubsBadProtocol: used a t.Fatalf() in checkFor instead of return fmt.Errorf(). - TestLeafNodeResetsMSGProto: this test is not about change to interest mode only, so to avoid possible mix of protos, delay a bit creation of gateway after creation of leaf node. - Some defer s.Shutdown() were missing Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-26 17:17:08 -06:00
Ivan Kozlovic	b325cf1e4a	Fixed loss of queue subscription interest across Gateways in some cases. Suppose two servers, SA in cluster A and SB in cluster B. If SA sends a message to SB on an account for which there is no interest at all (account not known or no subscription), SB will send an A- and keep track that it sent an A- for this account. When a queue subscription is created on SB, SB will send an RS+ to A because A needs to have perfect knowledge of all queue subs in all clusters. If then a regular subscription is also created on SB, SB will think that it needs to send an A+ because it had sent an A- for this account. However, SA had an entry for this account for the queue sub. The A+ would clear the entry in the map and would cause SA to not send messages to SB even if they would have been a match for the queue sub on SB. We fix this in two ways: - Clear the possible A- in SB when sending an RS+ for queue sub - Processing of A-/A+ to be aware of a possible entry in the map due to queue subs. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-25 16:27:00 -06:00
Ivan Kozlovic	48c3f7f846	Fixed some flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-05-24 09:53:35 -06:00
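
Note on commit 156511bba7 above: its message says a PUB size that overruns an int32 could bypass the maxpayload check. The Go sketch below illustrates one way such a check can be made safe. It is not the server's actual code: the names parseSize, checkPayload and maxSizeDigits are hypothetical, and capping the size field at nine digits is an assumption, chosen because every nine-digit decimal is below the int32 maximum (2147483647).

```go
package main

import "fmt"

// maxSizeDigits caps how many decimal digits a size field may contain.
// Hypothetical constant: any 9-digit decimal fits in an int32.
const maxSizeDigits = 9

// parseSize converts an ASCII decimal size field to an int.
// It returns -1 for empty input, non-digit bytes, or too many digits,
// so an oversized field is rejected before any arithmetic can wrap.
func parseSize(d []byte) int {
	if len(d) == 0 || len(d) > maxSizeDigits {
		return -1
	}
	n := 0
	for _, c := range d {
		if c < '0' || c > '9' {
			return -1
		}
		n = n*10 + int(c-'0')
	}
	return n
}

// checkPayload reports an error when the declared size is malformed
// or larger than maxPayload.
func checkPayload(sizeField []byte, maxPayload int32) error {
	n := parseSize(sizeField)
	if n < 0 {
		return fmt.Errorf("bad payload size %q", sizeField)
	}
	if int64(n) > int64(maxPayload) {
		return fmt.Errorf("payload size %d exceeds max payload %d", n, maxPayload)
	}
	return nil
}

func main() {
	// A normal size passes; a size larger than an int32 is rejected.
	fmt.Println(checkPayload([]byte("512"), 1024))
	fmt.Println(checkPayload([]byte("4294967297"), 1024))
}
```

Rejecting the field by length before converting it means the comparison against the limit always runs on a value that fits, rather than on one that may already have wrapped around.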

434 Commits