nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-02 11:48:43 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	b36672a6bc	Fixed flapper Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-12 16:51:40 -06:00
aricart	e7590f3065	jwt2 testbed	2020-06-01 18:00:13 -04:00
Derek Collison	2bd7553c71	System Account on by default. Most of the changes are to turn it off for tests that were watching subscriptions and such. Signed-off-by: Derek Collison <derek@nats.io>	2020-05-29 17:56:45 -07:00
Ivan Kozlovic	e9805a3109	[FIXED] Possible removal of interest on queue subs with leaf nodes Server was incorrectly processing a queue subscription removal as both a plain sub and queue sub, which may have resulted in drop of interest even when some queue subs remained. Resolves #1421 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-28 10:21:51 -06:00
Derek Collison	79ea95fe44	Fix flapper, wait for sub to propagate Signed-off-by: Derek Collison <derek@nats.io>	2020-05-25 06:58:23 -07:00
Ivan Kozlovic	5dba3cdd75	[FIXED] Race condition during implicit Gateway reconnection Say server in cluster A accepts a connection from a server in cluster B. The gateway is implicit, in that A does not have a configured remote gateway to B. Then the server in B is shut down, which A detects and initiates a single reconnect attempt (since it is implicit and if the reconnect retries is not set). While this happens, a new server in B is restarted and connects to A. If that happens before the initial reconnect attempt failed, A will register that new inbound and will not attempt to solicit because it already has a remote entry for gateway B. At this point when the reconnect to old server B fails, then the remote GW entry is removed, and A will not create an outbound connection to the new B server. We fix that by checking if there is a registered inbound when we get to the point of removing the remote on a failed implicit reconnect. If there is one, we try the reconnection. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-22 13:01:17 -06:00
Derek Collison	915e3cd74e	Header support for Leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:33:56 -07:00
R.I.Pienaar	63845b8577	add type hints to service latency, use time.Time for timestamp Signed-off-by: R.I.Pienaar <rip@devco.net>	2020-05-19 14:26:46 -07:00
Derek Collison	ea5e5bd364	Services rewrite #2 This contains a rewrite to the services layer for exporting and importing. The code this merges to already had a first significant rewrite that moved from special interest processing to plain subscriptions. This code changes the prior version's dealing with reverse mapping which was based mostly on thresholds and manual pruning, with some sporadic timer usage. This version uses the jetstream branch's code that understands interest and failed deliveries. So this code is much more tuned to reacting to interest changes. It also removes thresholds and goes only by interest changes or expirations based around a new service export property, response thresholds. This allows a service provider to provide semantics on how long a response should take at a maximum. This commit also introduces formal support for service export streamed and chunked response types, which send an empty message to signify EOF. This commit also includes additions to the service latency tracking such that errors are now sent, not only successful interactions. We have added a Status field and an optional Error field to ServiceLatency. We support the following Status codes, these are directly from HTTP. 400 Bad Request (request did not have a reply subject) 408 Request Timeout (when system detects request interest went away, old request style to make dependable) 503 Service Unavailable (no service responders running) 504 Service Timeout (The new response threshold expired) Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:26:46 -07:00
Derek Collison	7f458282b3	Double check we receive on the correct subject Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:20:02 -07:00
Derek Collison	d2ff4311d4	Rebase with master, updates to go.mod and vendor, bumped version Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:20:02 -07:00
Ivan Kozlovic	1cf21fc4ee	Fix some leafnode test flappers Make use of some existing helpers and add checkFor in some places since accounting updates may not be instantaneous. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-15 15:15:26 -06:00
Derek Collison	a301d6731b	Re-order client close Signed-off-by: Derek Collison <derek@nats.io>	2020-04-14 09:54:57 -07:00
Derek Collison	aff10aa16b	Fix for #1344 Signed-off-by: Derek Collison <derek@nats.io>	2020-04-14 09:26:35 -07:00
Derek Collison	ef85a1b836	Fix for #1336 Signed-off-by: Derek Collison <derek@nats.io>	2020-04-10 17:30:03 -07:00
Matthias Hanel	e8ce738808	Test of service across accounts and leaf node. Tests #1336 Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-04-10 15:55:10 -04:00
Derek Collison	f9d9ac193a	Use prefix to make sure we use right subject Signed-off-by: Derek Collison <derek@nats.io>	2020-04-10 10:49:05 -07:00
Derek Collison	090abc939d	Fix for stream imports and leafnodes, #1332 Signed-off-by: Derek Collison <derek@nats.io>	2020-04-10 10:36:20 -07:00
Derek Collison	e843a27bba	When a responder was on a leaf node and the requestor was connected to the same server as the leafnode we did not propagate the service reply wildcard properly. This fixes that. Signed-off-by: Derek Collison <derek@nats.io>	2020-04-10 08:35:09 -07:00
Derek Collison	699502de8f	Detection for loops with leafnodes. We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles. This will have the server who completes the loop be the one that detects the error solely based on its own loop detection subject. Other changes are just to fix tests that were not waiting for the new LDS sub. Signed-off-by: Derek Collison <derek@nats.io>	2020-04-08 20:00:40 -07:00
Derek Collison	82f585d83a	Updated to also resend leafnode connect on GW connect via first INFO Signed-off-by: Derek Collison <derek@nats.io>	2020-04-08 09:55:19 -07:00
Derek Collison	43fbe0ffed	This commit allows new servers in a supercluster to be informed of accounts with active leafnode connections. This is needed to put those accounts into interest only mode for inbound gateway connections. Also added code to make sure we were doing proper account tracking and would track the global account as well, which used to be excluded. Fixes #977 Signed-off-by: Derek Collison <derek@nats.io>	2020-04-07 16:22:15 -07:00
Matthias Hanel	6f77a54118	[FIXED] loop detection by checking for duplicate lds subscriptions This is in addition to checking if the own subscription comes back. The duplicated lds subscription must come from a different client. Added unit tests. Also prefixed lds with '$' to mark it as system subject going forward. This moves the loop detection check past other checks. These checks should not trigger in cases where a loop is initially detected. Fixes #1305 Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-03-17 19:06:35 -04:00
Matthias Hanel	68efc95a60	Modifying unit test error message to hint at ulimit -n possibly being too low Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-03-04 14:30:35 -05:00
Ivan Kozlovic	47b08335a4	[FIXED] Reset of tlsName only for x509.HostnameError For issue #1256, we cleared the possibly saved tlsName on Handshake failure. However, this meant that for normal use cases, if a reconnect failed for any reason we would not be able to reconnect if it is an IP until we get back to the URL that contained the hostname. We now clear only if the handshake error is of x509.HostnameError type, which includes errors such as: ``` "x509: Common Name is not a valid hostname: <x>" "x509: cannot validate certificate for <x> because it doesn't contain any IP SANs" "x509: certificate is not valid for any names, but wanted to match <x>" "x509: certificate is valid for <x>, not <y>" ``` Applied the same logic to solicited gateway connections, and fixed the fact that the tlsConfig should be cloned (since we set the ServerName). I have also made a change for leafnode connections similar to what we are doing for gateway connections, which is to use the saved tlsName only if tlsConfig.ServerName is empty, which may not be the case for users that embed NATS Server and pass directly tls configuration. In other words, if the option TLSConfig.ServerName is not empty, always use this value. Relates to #1256 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-01-28 13:16:38 -07:00
Derek Collison	643e73c0c5	Fix for #1256 , mixed IP and DNS for cluster and TLS with leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2020-01-22 11:25:09 -08:00
Ivan Kozlovic	947798231b	[UPDATED] TCP Write and SlowConsumer handling - All writes will now be done by the writeLoop, unless when the writeLoop has not been started yet (likely in connection init). - Slow consumers for non CLIENT connections will be reported but not failed. The idea is that routes, gateway, etc.. connections should stay connected as much as possible. However if a flush operation times out and no data at all has been written, the connection will be closed (regardless of type). - Slow consumers due to max pending is only for CLIENT connections. This allows sending of SUBs through routes, etc.. to not have to be chunked. - The backpressure to CLIENT connections is increased (up to 1sec) based on the sub's connection pending bytes level. - Connection is flushed on close from the writeLoop as to not block the "fast path". Some tests have been fixed and adapted since now closeConnection() is not flushing/closing/removing connection in place. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-31 15:06:27 -07:00
Ivan Kozlovic	1b2754475b	Refactor async client tests Updated all tests that use "async" clients. - start the writeLoop (this is in preparation for changes in the server that will not do send-in-place for some protocols, such as PING, etc..) - Added missing defers in several tests - fixed an issue in client.go where test was wrong possibly causing a panic. - Had to skip a test for now since it would fail without server code change. The next step will be to ensure that all protocols are sent through the writeLoop and that the data is properly flushed on close (important for -ERR for instance). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-12 11:58:24 -07:00
Ivan Kozlovic	63138509f7	Tune some code/test for Windows Running test suite on a Windows VM, I notice several failures. Updated the compute of the RTT to be at least 1ns. I think that this is just an issue with the VM I am running, but that change will have no impact for normal situations (since setting the rtt to the very minimum duration (1ns) instead of 0) and will prevent some tests from failing. Because of those same timer granularity issues, I had to add some delays between some actions in order for time.Sub()/Since() to actually report something more than 0. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-21 14:32:46 -07:00
Ivan Kozlovic	977c290bf2	[FIXED] Handling of split buffer for LEAF messages Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-18 11:55:18 -07:00
Derek Collison	07253c0517	Merge pull request #1196 from nats-io/daisy Allow interest propagation with daisy-chained leafnodes	2019-11-17 17:46:23 -08:00
Derek Collison	07da68ce56	Allow interest propagation with daisy chained leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2019-11-17 17:35:20 -08:00
Ivan Kozlovic	e0bc81d0ed	Make the Leafnode internal sub on _GR_.> This is needed for mapped gateway replies. We had used an extra token when implementing the new prefix, but it was then removed, but the leafnode subscription on _GR_...*.> was not updated. We now subscribe on _GR_.> There was a test that was passing because we were using inboxes that caused the pattern to match. Replaced with single token reply so that it would have caught this bug. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-17 17:37:09 -07:00
Derek Collison	3330820502	Fixed a bug where we leaked service imports. Also prior this would have leaked subscriptions as well. Signed-off-by: Derek Collison <derek@nats.io>	2019-11-14 13:29:17 -08:00
Ivan Kozlovic	aa843945c9	Work on Gateways reply mapping - New prefix that includes origin server for the request - Mapping done if request is service import or requestor has recent subscription - Subscription considered recent if less than 250ms - Destination server strip GW prefix before giving to client and restore when getting a reply on that subject - Mapping removed after 250ms - Server rejects client publish on "$GNR." (the new prefix) - Cluster and server hash are now 8 chars long and from base 62 alphabets - Mapped replies need to be sent to leafnode servers due to race (cluster B sends RS+ on GW inbound then RMSG on outbound, the RS+ may be processed later and cluster A may have given message to LN before RS+ on reply subject. So LN needs to accept the mapped reply but will strip to give to client and reassemble before sending it back) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-06 16:06:49 -07:00
Ivan Kozlovic	cbbc21ac25	Some update to leafnode subscription handling - Send all subs in place if smap is small - Skip sending update until after sendAllLeafSubs() is done Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-10-30 20:01:49 -06:00
Ivan Kozlovic	279cab2aaf	[FIXED] Detect loop between LeafNode servers This is achieved by subscribing to a unique subject. If the LS+ protocol is coming back for the same subject on the same account, then this indicates a loop. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-10-29 16:14:35 -06:00
Derek Collison	52430c304a	System level services for debugging. This is the first pass at introducing exported services to the system account for generally debugging of blackbox systems. The first service reports number of subscribers for a given subject. The payload of the request is the subject, and optional queue group, and can contain wildcards. Signed-off-by: Derek Collison <derek@nats.io>	2019-09-17 09:37:35 -07:00
Derek Collison	7989118c3f	First pass latency tracking for exported services Signed-off-by: Derek Collison <derek@nats.io>	2019-08-30 10:52:48 -07:00
Ivan Kozlovic	7ca8723942	[FIXED] Some Leafnode issues - On startup, verify that local account in leafnode (if specified can be found otherwise fail startup). - At runtime, print error and continue trying to reconnect. Will need to decide a better approach. - When using basic auth (user/password), it was possible for a solicited Leafnode connection to not use user/password when trying an URL that was discovered through gossip. The server now saves the credentials of a configured URL to use with the discovered ones. Updated RouteRTT test in case RTT does not seem to be updated because getting always the same value. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-23 14:08:07 -06:00
Ivan Kozlovic	77c63dbce1	Fix flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-08-20 17:07:22 -06:00
Alberto Ricart	273e5af0a8	Fixed an issue where the leaf authentication was not checking for account/signers, so user JWTs signed by a signer failed authentication.	2019-07-17 16:03:55 -04:00
Ivan Kozlovic	0873b46f67	[FIXED] LeafNode urls may be missing in INFO sent to LN connections When a cluster of servers are having routes to each other, there is a chance that the list of leafnode URLs maintained on each server is not complete. This would result in LN servers connecting to this cluster to not get the full list of possible URLs the server could reconnect to. Also fixed a DATA RACE that appeared when running the updated TestLeafNodeInfoURLs test. Fixed the race and added specific test that easily demonstrated the race: TestLeafNodeNoRaceGeneratingNonce Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-07-12 19:15:30 -06:00
Derek Collison	a795920dc3	Report authorization error and use TLS hostname for IPs on leafnodes. Signed-off-by: Derek Collison <derek@nats.io>	2019-07-12 13:57:16 -07:00
Derek Collison	951ae49100	Prevent multiple solicited leafnodes from forming cycles. When a solicited leafnode comes from multiple servers that themselves are a cluster, cycles were formed. This change allows solicited leafnodes to behave similarly to gateways in that each server of a cluster is expected to have a solicited leafnode per destination account and cluster. We no longer forward subscription interest or messages to a cluster from a server that has a solicited leafnode. Signed-off-by: Derek Collison <derek@nats.io>	2019-07-10 20:16:47 -07:00
Derek Collison	10d4f1ab7a	Convert leafnode solicited remotes to array Signed-off-by: Derek Collison <derek@nats.io>	2019-07-10 11:53:34 -07:00
Derek Collison	a61d32a82c	Test for staggered leafnodes and sub/pub. Verifies fix for #1066 Signed-off-by: Derek Collison <derek@nats.io>	2019-07-10 09:57:43 -07:00
Derek Collison	49707317a1	Make sure we route responses across leafnodes Signed-off-by: Derek Collison <derek@nats.io>	2019-07-08 16:20:40 -07:00
Derek Collison	100d0d2b02	Use default port for leafnode remote if not specified Signed-off-by: Derek Collison <derek@nats.io>	2019-06-29 17:50:21 -07:00
Derek Collison	d1a782e014	Messages not distributed evenly when sourced from leafnode. When messages came from a leafnode they were not being distributed evenly to the destination cluster. Signed-off-by: Derek Collison <derek@nats.io>	2019-06-11 20:37:49 -07:00


66 Commits