nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-02 11:48:43 -07:00

Author	SHA1	Message	Date
Jaime Piña	4d04f281fc	Randomize leafnode route URLs and add option to disable	2021-04-23 14:59:15 -07:00
Ivan Kozlovic	1014041be3	[FIXED] Possible panic due to concurrent access to unlocked map This could happen when a leafnode has permissions set and another connection (client, etc..) is about to assign a message to the leafnode while the leafnode itself is receiving messages and they both check permissions at the same time. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-20 21:18:13 -06:00
Ivan Kozlovic	6e1205b660	Cleanup some tests + GetTLSConnectionState() race fix Missing defers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-15 11:37:43 -06:00
Ivan Kozlovic	c369a26c03	Fixed leafnode flapper Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-12 09:31:33 -06:00
Jaime Piña	27e9628c3a	Run gofmt -s to simplify code	2021-04-09 15:18:06 -07:00
Ivan Kozlovic	452685b9b1	[FIXED] LeafNode: set first ping timer after receiving CONNECT We were setting the ping timer in the accepting server as soon as the leafnode connection is created, just after sending the INFO and setting the auth timer. Sending a PING too soon may cause the solicit side to process this PING and send a PONG in response, possibly before sending the CONNECT, which the accepting side would fail as an authentication error, since the first protocol is expected to be a CONNECT. Since LeafNode always expects a CONNECT, we always set the auth timer. So now on accept, instead of starting the ping timer just after sending the INFO, we will delay setting this timer until after receiving the CONNECT. The auth timer will take care of a stale connection in the time it takes to receive the CONNECT. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-08 14:36:39 -06:00
Jaime Piña	d929ee1348	Check errors when removing test directories and files Currently in tests, we have calls to os.Remove and os.RemoveAll where we don't check the returned error. This hides useful error messages when tests fail to run, such as "too many open files". This change checks for more filesystem related errors and calls t.Fatal if there is an error.	2021-04-07 11:09:47 -07:00
Jaime Piña	e44275b963	Consolidate temporary test files and directories Currently, temporary test files and directories are written in lots of different paths within the OS's temp dir. This makes it hard to know which files are from nats-server and which are unrelated. This in turn makes it hard to clean up nats-server test files.	2021-04-06 10:42:55 -07:00
Ivan Kozlovic	21a9bfa1d8	[FIXED] Leafnode: incorrect loop detection in multi-cluster setup If leafnodes from a cluster were to reconnect to a server in a different cluster, it was possible for that server to send to the leafnodes some of their own subscriptions that could cause an improper loop detection error. There was also a defect that would cause subscriptions over route for leafnode subscriptions to be registered under the wrong key, which would lead to those subscriptions not being properly removed on route disconnect. Finally, during route disconnect, the leafnodes map was not updated. This PR fixes that too. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-05 16:49:37 -06:00
Matthias Hanel	9f753a2475	[fixed] issue where verify_and_map: true in leaf node config was not used (#2038 ) * [fixed] issue where verify_and_map: true in leaf node config was not used This broke the setup in such a way that any connect relying on this would have failed. This also fixes an issue where specifying no account did not result in using $G. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-03-26 19:24:01 -04:00
Ivan Kozlovic	b17f38e356	[FIXED] Websocket: do not generate empty frames + LN corruption - It was possible that when the server was sending frames to a webbrowser, it would send empty frames. While technically not wrong, prevent that from happening. - Not copying enqueued buffers could cause corruption with LN+WS. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-03-26 16:17:46 -06:00
Ivan Kozlovic	eafc6b7a25	[fixed] LeafNode sending message using stream's import subject. A publish on "a" becomes an LMSG on ">" which is the stream import's subject. The subscriber on "a" on the other side did not receive the message. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-02-19 00:11:41 -05:00
Ivan Kozlovic	ac0a1ee8fd	Fixed compression http header request/response The issue was introduced by PR #1858. Key points: - Sec-WebSocket-Extensions must contain approved headers, so moving the "no-masking" private extension to its own header "Nats-No-Masking". - The format of the permessage-deflate negotiation response became invalid, I have fixed that. - For leaf nodes, if `permessage-deflate` extension is not at all present in the response, then simply disable compression, however if it is present but there is no server/client no context take over, then we have to fail the connection. - A leafnode test was not setting the "NoMasking" option so the test TestLeafNodeWSNoMaskingRejected was not capturing possible error if negotiation failed. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-02-01 12:10:37 -07:00
Ivan Kozlovic	9587bf8cd4	Changed option to make masking the default and option to disable it This will allow a better experience if there is a load balancer in between and expects websocket frames to be masked. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-29 11:22:22 -07:00
Ivan Kozlovic	2b8c6e0124	Support for Websocket Leafnode connections Added two options in the remote leaf node configuration - compress, for websocket only at the moment - ws_masking, to force remote leafnode connections to mask websocket frames (default is no masking since it is communication between server to server) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Ivan Kozlovic	131be1cb33	Make TLS client/server handshake helpers function This reduces code duplication Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Ivan Kozlovic	9716aa8b4c	Merge pull request #1846 from nats-io/ln_save_tls_name [FIXED] LeafNode: save hostname that may be used during TLS handshake	2021-01-26 14:51:11 -07:00
Ivan Kozlovic	af57f55738	Fixing some flappers (leafnode and mqtt) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-26 14:23:49 -07:00
Ivan Kozlovic	6666f5aa43	[FIXED] LeafNode: save hostname that may be used during TLS handshake Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-26 12:10:57 -07:00
Ivan Kozlovic	0d78bce9cf	Fixed some leafnode issues introduced from JS cluster work Also fixed a flapper. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-15 12:00:34 -07:00
Derek Collison	f0cdf89c61	JetStream Clustering WIP Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 01:14:52 -08:00
Ivan Kozlovic	d5f255b98e	Merge pull request #1771 from nats-io/gw_ln_tls_config_reload [FIXED] Config reload for gateways/leaf remote TLS configurations	2020-12-12 10:51:52 -07:00
Ivan Kozlovic	2d2f85267b	Add fix for TestLeafNodeLoop and others Based on timing, it is possible that the first error is about connection refused as opposed to "Loop detected". So use a dedicated logger to notify only when expected error is found. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-11 18:15:49 -07:00
Ivan Kozlovic	fc1521636c	[FIXED] Config reload for gateways/leaf remote TLS configurations Presence of TLS config in any remote gateway or leafnode would cause the config reload to fail (because TLS config internal content may change which fails the DeepEqual check). This PR excludes the TLS configs in such case to check for changes in gateways and leafnodes. Although GW and LN config reload is technically supported, this PR updates the internal remotes' TLS configuration so that changes/updates to TLS certificates would take effect after a configuration reload. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-11 16:56:25 -07:00
Ivan Kozlovic	77aead807c	Send LS- without origin to route When cluster origin code was added, a server may send LS+ with an origin cluster name in the protocol. Parsing code from a ROUTER connection was adjusted to understand this LS+ protocol. However, the server was also sending an LS- with origin but the parsing code was not able to understand that. When the unsub was for a queue subscription, this would cause the parser to error out and close the route connection. This PR sends an LS- without the origin in this case (so that tracing makes sense in terms of LS+/LS- sent to a route). The receiving side then traces appropriate LS- but processes as a normal RS-. Resolves #1751 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-11-30 13:31:32 -07:00
Ivan Kozlovic	88475014ef	[FIXED] Split LMSG across routes If a LeafNode message is sent across a route, and the message does not fit in the buffer, the parser would incorrectly process the "pub args" as if it was a ROUTED message, not a LEAF message. This caused clonePubArg() to return an error that would cause the parser to end with a protocol violation. Keep track that we are processing an LMSG so that we can pass that information to clonePubArg() and do proper parsing in split scenario. Resolves #1743 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-11-24 14:55:50 -07:00
Ivan Kozlovic	120b031ffd	Merge pull request #1739 from nats-io/leaf-warning [Added] account name checks for leaf nodes in operator mode	2020-11-24 12:35:31 -07:00
Matthias Hanel	a8390b7432	Incorporating comments and moving code Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-11-23 23:27:44 -05:00
Matthias Hanel	e2e69b6daf	[Added] account name checks for leaf nodes in operator mode Rules out implausible ones. Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-11-23 15:38:41 -05:00
Ivan Kozlovic	f155c75da7	[FIXED] LeafNode reject duplicate remote There was a test to prevent an erroneous loop detection when a remote would reconnect (due to a stale connection) while the accepting side did not detect the bad connection yet. However, this test was racy because the test was done prior to adding the connections to the map. In the case of a misconfiguration where the remote creates 2 different remote connections that end up binding to the same account in the accepting side, then it was possible that this would not be detected. And when it was, the remote side would be unaware since the disconnect/reconnect attempts would not show up if not running in debug mode. This change makes sure that the detection is no longer racy and returns an error to the remote so at least the log/console of the remote will show the "duplicate connection" error messages. Resolves #1730 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-11-23 13:28:18 -07:00
Ivan Kozlovic	55b0f8d855	[FIXED] LeafNode: duplicate queue messages in complex routing setup Suppose a cluster of 2 servers, let's call them leaf1 and leaf2. These servers are routed and have a leaf connection to another server, let's call it srv1. They share the same cluster name. If a queue subscriber runs on srv1 and a queue subscriber on the same subject/group name runs on leaf1, if a requestor runs on leaf2, the request should reach only one of the 2 queue subs. The defect was that sometimes both queue subs would receive the message. The added test checks that only one reply is ever received and that the local "leaf" cluster is preferred. Resolves #1722 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-11-18 11:23:08 -07:00
Ivan Kozlovic	2605ae71ed	[FIXED] Prevent LeafNode loop detection on early reconnect If the soliciting side detects the disconnect and attempts to reconnect but the accepting side did not yet close the connection, a "loop detected" error would be reported and the soliciting server would not try to reconnect for 30 seconds. Made a change so that the accepting server checks for existing leafnode connection for the same server and same account, and if it is found, close the "old" connection so it is replaced by the "new" one. Resolves #1606 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-09-22 16:58:36 -06:00
Derek Collison	920617d64a	Updates based on feedback Signed-off-by: Derek Collison <derek@nats.io>	2020-06-26 10:29:53 -07:00
Derek Collison	6c805eebc7	Properly support leafnode clusters. Leafnodes that formed clusters were partially supported. This adds proper support for origin cluster, subscription suppression and data message no echo for the origin cluster. Signed-off-by: Derek Collison <derek@nats.io>	2020-06-26 09:03:22 -07:00
Derek Collison	3541e3f0f9	Updated older tests for new functionality Signed-off-by: Derek Collison <derek@nats.io>	2020-06-16 10:56:39 -07:00
Derek Collison	2b9e3e5b15	Merge pull request #1476 from nats-io/cluster_name Cluster names are now required.	2020-06-15 10:07:30 -07:00
Derek Collison	146d8f5dcb	Updates based on feedback, sped up some slow tests Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 17:26:43 -07:00
Ivan Kozlovic	61cccbce02	[FIXED] LeafNode solicit failure race could leave conn registered This was found due to a recent test that was flapping. The test was not checking the correct server for leafnode connection, but that uncovered the following bug: When a leafnode connection is solicited, the read/write loops are started. Then, the connection lock is released and several functions invoked to register the connection with an account and add to the connection leafs map. The problem is that the readloop (for instance) could get a read error and close the connection before the above said code executes, which would lead to a connection incorrectly registered. This could be fixed either by delaying the start of read/write loops after the registration is done, or like in this PR, check the connection close status after registration, and if closed, manually undoing the registration with account/leafs map. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-12 16:01:13 -06:00
Ivan Kozlovic	d2a8282a0d	[FIXED] LeafNode TLSMap and websocket auth override We added authentication override block for websocket configuration in PR #1463 and #1465 which somehow introduced a drop in perf as reported by the bench tests. This PR refactors a bit to restore the performance numbers. This change also fixes the override behavior for websocket auth: - If websocket's NoAuthUser is configured, the websocket's auth block MUST define Users, and the user be present. - If there is any override (username/pwd,token,etc..) then the whole block config will be used when authenticating a websocket client, which means that if websocket NoAuthUser is empty we are not falling back to the regular client's NoAuthUser config. - TLSMap always overrides the regular client's config. That is, whatever TLSMap value specified in the websocket's tls{} block will be used. The TLSMap configuration was not used for LeafNodes. The behavior now will be: - If LeafNode's auth block contains users and TLSMap is true, the user is looked up based on the cert's info. If not found, authentication will fail. If found, it will be authenticated and bound to associated account. - If no user is specified in LeafNode's auth block and TLSMap is true, then the cert's info will be used against the global users map. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-06-11 17:06:54 -06:00
Ivan Kozlovic	44e78a1fb6	Fixed some tests - A race test may have consumed a lot of fds going in TIME_WAIT that could cause some issues for other tests - Missing defer filestore.Stop() that would leave flushLoop() routines - A defer for the from server in a LeafNode test - Rework [Re]ConnectErrorReports that was failing often for me locally (probably due to exhaustion of fds - too many TIME_WAIT). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-29 17:47:08 -06:00
Derek Collison	57d8cdb1d1	Fix flapper, wait for subs to propagate Signed-off-by: Derek Collison <derek@nats.io>	2020-05-25 06:58:23 -07:00
Ivan Kozlovic	8f05bc5c46	[FIXED] Possible stall on shutdown with leafnode setup If a leafnode connection is accepted but the server is shutdown before the connection is fully registered, the shutdown would stall because read and write loop go routine would not be stopped. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-22 15:26:04 -06:00
Ivan Kozlovic	c54f41acd6	Fixed flapper test Make sure that the subscription on "service" is registered on server A. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-14 13:27:17 -06:00
Derek Collison	6fa7f1ce82	Have hub role sent to accepting side and adapt to be a spoke Signed-off-by: Derek Collison <derek@nats.io>	2020-04-13 15:18:42 -07:00
Ivan Kozlovic	439dca67c8	Test for interest propagation with GWs and Hub leafnodes Setup: B <- GW -> C / \ v v A D Leafnodes are created from B to A and C to D. The remotes on B and C have the option "Hub: true". The replier connects to D and listens to "service". The requestor connects to "A" and sends the request on "service". The reply does not make it back to A. If the requestor on A, instead of calling Request(), first creates a subscription on an inbox, wait a little bit (few 100s ms), then publishes the request on "service" with that inbox for the reply subject, the reply makes it back to A. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-13 12:47:18 -06:00
Ivan Kozlovic	b200368e52	LeafNode: delay connect even when loop detected by accepting side If the loop is detected by a server accepting the leafnode connection, an error is sent back and connection is closed. This change ensures that the server checks an -ERR for "Loop detected" and then set the connect delay, so that it does not try to reconnect right away. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-10 16:44:16 -06:00
Ivan Kozlovic	34eb5bda31	[ADDED] Deny import/export options for LeafNode remote configuration This will allow a leafnode remote connection to prevent unwanted messages to be received, or prevent local messages to be sent to the remote server. Configuration will be something like: ``` leafnodes { remotes: [ { url: "nats://localhost:6222" deny_imports: ["foo.", "bar"] deny_exports: ["baz.", "bat"] } ] } ``` Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-09 18:55:44 -06:00
Derek Collison	699502de8f	Detection for loops with leafnodes. We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles. This will have the server who completes the loop be the one that detects the error solely based on its own loop detection subject. Other changes are just to fix tests that were not waiting for the new LDS sub. Signed-off-by: Derek Collison <derek@nats.io>	2020-04-08 20:00:40 -07:00
Ivan Kozlovic	76e8e1c9b0	[ADDED] Leafnode remote's Hub option This allows a node that creates a remote LeafNode connection to act as if it were the hub (of the hub and spoke topology). This is related to subscription interest propagation. Normally, a spoke (the one creating the remote LN connection) will forward only its local subscriptions and when receiving subscription interest would not try to forward to local cluster and/or gateways. If a remote has the Hub boolean set to true, even though the node is the one creating the remote LN connection, it will behave as if it was accepting that connection. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-07 13:42:55 -06:00
Matthias Hanel	6f77a54118	[FIXED] loop detection by checking for duplicate lds subscriptions This is in addition to checking if the own subscription comes back. The duplicated lds subscription must come from a different client. Added unit tests. Also prefixed lds with '$' to mark it as system subject going forward. This moves the loop detection check past other checks. These checks should not trigger in cases where a loop is initially detected. Fixes #1305 Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-03-17 19:06:35 -04:00

64 Commits