Commit Graph

53 Commits

Author SHA1 Message Date
Ivan Kozlovic
eafc6b7a25 [fixed] LeafNode sending message using stream's import subject.
A publish on "a" becomes an LMSG on ">" which
is the stream import's subject. The subscriber on "a" on the other
side did not receive the message.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-02-19 00:11:41 -05:00
Ivan Kozlovic
ac0a1ee8fd Fixed compression http header request/response
The issue was introduced by PR #1858.

Key points:

- Sec-WebSocket-Extensions must contain approved headers, so moving
the "no-masking" private extension to its own header "Nats-No-Masking".

- The format of the permessage-deflate negotiation response became
invalid, I have fixed that.

- For leaf nodes, if `permessage-deflate` extension is not at all
present in the response, then simply disable compression, however
if it is present but there is no server/client no context take over,
then we have to fail the connection.

- A leafnode test was not setting the "NoMasking" option so the
test TestLeafNodeWSNoMaskingRejected was not capturing possible
error if negotiation failed.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-02-01 12:10:37 -07:00
Ivan Kozlovic
9587bf8cd4 Changed option to make masking the default and option to disable it
This will allow a better experience if there is a load balancer
in between and expects websocket frames to be masked.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-29 11:22:22 -07:00
Ivan Kozlovic
2b8c6e0124 Support for Websocket Leafnode connections
Added two options in the remote leaf node configuration

- compress, for websocket only at the moment
- ws_masking, to force remote leafnode connections to mask websocket
frames (default is no masking since it is communication between
server to server)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-28 13:13:11 -07:00
Ivan Kozlovic
131be1cb33 Make TLS client/server handshake helpers function
This reduces code duplication

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-28 13:13:11 -07:00
Ivan Kozlovic
9716aa8b4c Merge pull request #1846 from nats-io/ln_save_tls_name
[FIXED] LeafNode: save hostname that may be used during TLS handshake
2021-01-26 14:51:11 -07:00
Ivan Kozlovic
af57f55738 Fixing some flappers (leafnode and mqtt)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-26 14:23:49 -07:00
Ivan Kozlovic
6666f5aa43 [FIXED] LeafNode: save hostname that may be used during TLS handshake
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-26 12:10:57 -07:00
Ivan Kozlovic
0d78bce9cf Fixed some leafnode issues introduced from JS cluster work
Also fixed a flapper.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-15 12:00:34 -07:00
Derek Collison
f0cdf89c61 JetStream Clustering WIP
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-14 01:14:52 -08:00
Ivan Kozlovic
d5f255b98e Merge pull request #1771 from nats-io/gw_ln_tls_config_reload
[FIXED] Config reload for gateways/leaf remote TLS configurations
2020-12-12 10:51:52 -07:00
Ivan Kozlovic
2d2f85267b Add fix for TestLeafNodeLoop and others
Based on timing, it is possible that the first error is about
connection refused as opposed to "Loop detected". So use a dedicated
logger to notify only when expected error is found.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-12-11 18:15:49 -07:00
Ivan Kozlovic
fc1521636c [FIXED] Config reload for gateways/leaf remote TLS configurations
Presence of TLS config in any remote gateway or leafnode would
cause the config reload to fail (because TLS config internal
content may change which fails the DeepEqual check).

This PR excludes the TLS configs in such case to check for
changes in gateways and leafnodes.

Although GW and LN config reload is technically supported, this
PR updates the internal remotes' TLS configuration so that
changes/updates to TLS certificates would take effect after
a configuration reload.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-12-11 16:56:25 -07:00
Ivan Kozlovic
77aead807c Send LS- without origin to route
When cluster origin code was added, a server may send LS+ with
an origin cluster name in the protocol. Parsing code from a ROUTER
connection was adjusted to understand this LS+ protocol.
However, the server was also sending an LS- with origin but the
parsing code was not able to understand that. When the unsub was
for a queue subscription, this would cause the parser to error out
and close the route connection.

This PR sends an LS- without the origin in this case (so that tracing
makes sense in term of LS+/LS- sent to a route). The receiving side
then traces appropriate LS- but processes as a normal RS-.

Resolves #1751

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-11-30 13:31:32 -07:00
Ivan Kozlovic
88475014ef [FIXED] Split LMSG across routes
If a LeafNode message is sent across a route, and the message
does not fit in the buffer, the parser would incorrectly process
the "pub args" as if it was a ROUTED message, not a LEAF message.
This caused clonePubArg() to return an error that would cause
the parser to end with a protocol violation.

Keep track that we are processing an LMSG so that we can pass
that information to clonePubArg() and do proper parsing in split
scenario.

Resolves #1743

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-11-24 14:55:50 -07:00
Ivan Kozlovic
120b031ffd Merge pull request #1739 from nats-io/leaf-warning
[Added] account name checks for leaf nodes in operator mode
2020-11-24 12:35:31 -07:00
Matthias Hanel
a8390b7432 Incorporating comments and moving code
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-11-23 23:27:44 -05:00
Matthias Hanel
e2e69b6daf [Added] account name checks for leaf nodes in operator mode
Rules out implausible ones.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-11-23 15:38:41 -05:00
Ivan Kozlovic
f155c75da7 [FIXED] LeafNode reject duplicate remote
There was a test to prevent an errorneous loop detection when a
remote would reconnect (due to a stale connection) while the accepting
side did not detect the bad connection yet.

However, this test was racy because the test was done prior to add
the connections to the map.

In the case of a misconfiguration where the remote creates 2 different
remote connections that end-up binding to the same account in the
accepting side, then it was possible that this would not be detected.
And when it was, the remote side would be unaware since the disconnect/
reconnect attempts would not show up if not running in debug mode.

This change makes sure that the detection is no longer racy and returns
an error to the remote so at least the log/console of the remote will
show the "duplicate connection" error messages.

Resolves #1730

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-11-23 13:28:18 -07:00
Ivan Kozlovic
55b0f8d855 [FIXED] LeafNode: duplicate queue messages in complex routing setup
Suppose a cluster of 2 servers, let's call them leaf1 and leaf2.
These servers are routed and have a leaf connection to another
server, let's call it srv1.
They share the same cluster name.

If a queue subscriber runs on srv1 and a queue subscriber on the
same subject/group name runs on leaf1, if a requestor runs on
leaf2, the request should reach only one of the 2 queue subs.

The defect was that sometimes both queue subs would receive the
message.

The added test checks that only one reply is ever received and
that the local "leaf" cluster is preferred.

Resolves #1722

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-11-18 11:23:08 -07:00
Ivan Kozlovic
2605ae71ed [FIXED] Prevent LeafNode loop detection on early reconnect
If the soliciting side detects the disconnect and attempts to
reconnect but the accepting side did not yet close the connection,
a "loop detected" error would be reported and the soliciting server
would not try to reconnect for 30 seconds.

Made a change so that the accepting server checks for existing
leafnode connection for the same server and same account, and if
it is found, close the "old" connection so it is replaced by
the "new" one.

Resolves #1606

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-09-22 16:58:36 -06:00
Derek Collison
920617d64a Updates based on feedback
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-26 10:29:53 -07:00
Derek Collison
6c805eebc7 Properly support leadnode clusters.
Leafnodes that formed clusters were partially supported. This adds proper support for origin cluster, subscription suppression and data message no echo for the origin cluster.

Signed-off-by: Derek Collison <derek@nats.io>
2020-06-26 09:03:22 -07:00
Derek Collison
3541e3f0f9 Updated older tests for new functionality
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-16 10:56:39 -07:00
Derek Collison
2b9e3e5b15 Merge pull request #1476 from nats-io/cluster_name
Cluster names are now required.
2020-06-15 10:07:30 -07:00
Derek Collison
146d8f5dcb Updates based on feedback, sped up some slow tests
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-12 17:26:43 -07:00
Ivan Kozlovic
61cccbce02 [FIXED] LeafNode solicit failure race could leave conn registered
This was found due to a recent test that was flapping. The test
was not checking the correct server for leafnode connection, but
that uncovered the following bug:

When a leafnode connection is solicited, the read/write loops are
started. Then, the connection lock is released and several
functions invoked to register the connection with an account and
add to the connection leafs map.
The problem is that the readloop (for instance) could get a read
error and close the connection *before* the above said code
executes, which would lead to a connection incorrectly registered.

This could be fixed either by delaying the start of read/write loops
after the registration is done, or like in this PR, check the
connection close status after registration, and if closed, manually
undoing the registration with account/leafs map.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-12 16:01:13 -06:00
Ivan Kozlovic
d2a8282a0d [FIXED] LeafNode TLSMap and websocket auth override
We added authentication override block for websocket configuration
in PR #1463 and #1465 which somehow introduced a drop in perf as
reported by the bench tests.
This PR refactors a bit to restore the performance numbers.

This change also fixes the override behavior for websocket auth:
- If websocket's NoAuthUser is configured, the websocket's auth
  block MUST define Users, and the user be present.
- If there is any override (username/pwd,token,etc..) then the
  whole block config will be used when authenticating a websocket
  client, which means that if websocket NoAuthUser is empty we
  are not falling back to the regular client's NoAuthUser config.
- TLSMap always override the regular client's config. That is,
  whatever TLSMap value specified in the websocket's tls{} block
  will be used.

The TLSMap configuration was not used for LeafNodes. The behavior
now will be:
- If LeafNode's auth block contains users and TLSMap is true,
  the user is looked up based on the cert's info. If not found,
  authentication will fail. If found, it will be authenticated
  and bound to associated account.
- If no user is specified in LeafNode's auth block and TLSMap
  is true, then the cert's info will be used against the global
  users map.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-11 17:06:54 -06:00
Ivan Kozlovic
44e78a1fb6 Fixed some tests
- A race test may have consumed a lot of fds going in TIME_WAIT
that could cause some issues for other tests
- Missing defer filestore.Stop() that would leave flushLoop()
routines
- A defer for the from server in a LeafNode test
- Rework [Re]ConnectErrorReports that was failing often for me
locally (probably due to exhaustion of fds - too many TIME_WAIT).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-29 17:47:08 -06:00
Derek Collison
57d8cdb1d1 Fix flapper, wait for subs to propagate
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 06:58:23 -07:00
Ivan Kozlovic
8f05bc5c46 [FIXED] Possible stall on shutdown with leafnode setup
If a leafnode connection is accepted but the server is shutdown
before the connection is fully registered, the shutdown would
stall because read and write loop go routine would not be
stopped.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-22 15:26:04 -06:00
Ivan Kozlovic
c54f41acd6 Fixed flapper test
Make sure that the subscription on "service" is registered on
server A.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-04-14 13:27:17 -06:00
Derek Collison
6fa7f1ce82 Have hub role sent to accepting side and adapt to be a spoke
Signed-off-by: Derek Collison <derek@nats.io>
2020-04-13 15:18:42 -07:00
Ivan Kozlovic
439dca67c8 Test for interest propagation with GWs and Hub leafnodes
Setup:

  B <- GW -> C
 /            \
v              v
A              D

Leafnodes are created from B to A and C to D. The remotes on B and
C have the option "Hub: true".

The replier connects to D and listens to "service". The requestor
connects to "A" and sends the request on "service". The reply does
not make it back to A.
If the requestor on A, instead of calling Request(), first creates
a subscription on an inbox, wait a little bit (few 100s ms), then
publishes the request on "service" with that inbox for the reply
subject, the reply makes it back to A.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-04-13 12:47:18 -06:00
Ivan Kozlovic
b200368e52 LeafNode: delay connect even when loop detected by accepting side
If the loop is detected by a server accepting the leafnode connection,
an error is sent back and connection is closed.
This change ensures that the server checks an -ERR for "Loop detected"
and then set the connect delay, so that it does not try to reconnect
right away.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-04-10 16:44:16 -06:00
Ivan Kozlovic
34eb5bda31 [ADDED] Deny import/export options for LeafNode remote configuration
This will allow a leafnode remote connection to prevent unwanted
messages to be received, or prevent local messages to be sent
to the remote server.

Configuration will be something like:
```
leafnodes {
  remotes: [
    {
      url: "nats://localhost:6222"
      deny_imports: ["foo.*", "bar"]
      deny_exports: ["baz.*", "bat"]
    }
  ]
}
```

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-04-09 18:55:44 -06:00
Derek Collison
699502de8f Detection for loops with leafnodes.
We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles.
This will have the server who completes the loop be the one that detects the error soley based on
its own loop detection subject.

Otehr changes are just to fix tests that were not waiting for the new LDS sub.

Signed-off-by: Derek Collison <derek@nats.io>
2020-04-08 20:00:40 -07:00
Ivan Kozlovic
76e8e1c9b0 [ADDED] Leafnode remote's Hub option
This allows a node that creates a remote LeafNode connection to
act as it was the hub (of the hub and spoke topology). This is
related to subscription interest propagation. Normally, a spoke
(the one creating the remote LN connection) will forward only
its local subscriptions and when receiving subscription interest
would not try to forward to local cluster and/or gateways.
If a remote has the Hub boolean set to true, even though the
node is the one creating the remote LN connection, it will behave
as if it was accepting that connection.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-04-07 13:42:55 -06:00
Matthias Hanel
6f77a54118 [FIXED] loop detection by checking for duplicate lds subscriptions
This is in addition to checking if the own subscription comes back.
The duplicated lds subscription must come from a different client.
Added unit tests.
Also prefixed lds with '$' to mark it as system subject going forward.

This moves the loop detection check past other checks.
These checks should not trigger in cases where a loop is initially detected.

Fixes #1305

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-03-17 19:06:35 -04:00
Ivan Kozlovic
27ae160f75 Use CID and LeafNodeURLs as an indicator connected to proper port
First, the test should be done only for the initial INFO and only
for solicited connections. Based on the content of INFO coming
from different "listen ports", use the CID and LeafNodeURLs for
the indication that we are connected to the proper port.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-01-29 14:43:41 -07:00
Waldemar Quevedo
ecb5008fe3 Add check prevent leafnode connecting to client port
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2020-01-28 12:43:27 -08:00
Ivan Kozlovic
a22da91647 [FIXED] Closing of Gateway or Route TLS connection may hang
This could happen if the remote server is running but not dequeueing
from the socket. TLS connection Close() may send/read and so we
need to protect with a deadline.

For non client/leaf connection, do not call flushOutbound().
Set the write deadline regardless of handshakeComplete flag, and
set it to a low value.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-12-04 17:27:00 -07:00
Ivan Kozlovic
279cab2aaf [FIXED] Detect loop between LeafNode servers
This is achieved by subscribing to a unique subject. If the LS+
protocol is coming back for the same subject on the same account,
then this indicates a loop.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-29 16:14:35 -06:00
Ivan Kozlovic
07bf4a499e Issue with multiple users in Leafnode authorization
This was introduced in master #1147, not in any public release.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-29 13:34:30 -06:00
Ivan Kozlovic
18a1702ba2 [ADDED] Basic auth for leafnodes
Added a way to specify which account an accepted leafnode connection
should be bound to when using simple auth (user/password).

Singleton:
```
leafnodes {
  port: ...
  authorization {
    user: leaf
    password: secret
    account: TheAccount
  }
}
```
With above configuration, if a soliciting server creates a LN connection
with url: `nats://leaf:secret@host:port`, then the accepting server
will bind the leafnode connection to the account "TheAccount". This account
need to exist otherwise the connection will be rejected.

Multi:
```
leafnodes {
  port: ...
  authorization {
    users = [
      {user: leaf1, password: secret, account: account1}
      {user: leaf2, password: secret, account: account2}
    ]
  }
}
```
With the above, if a server connects using `leaf1:secret@host:port`, then
the accepting server will bind the connection to account `account1`.

If user/password (either singleton or multi) is defined, then the connecting
server MUST provide the proper credentials otherwise the connection will
be rejected.

If no user/password info is provided, it is still possible to provide the
account the connection should be associated with:
```
leafnodes {
  port: ...
  authorization {
    account: TheAccount
  }
}
```
With the above, a connection without credentials will be bound to the
account "TheAccount".

If credentials are used (jwt, nkey or other), then the server will attempt
to authenticate and if successful associate to the account for that specific
user. If the user authentication fails (wrong password, no such user, etc..)
the connection will be also rejected.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-30 19:42:11 -06:00
Ivan Kozlovic
cd9f898eb0 Made a server's helper to set first ping timer
Defaults to 1sec but will be opts.PingInterval if value is lower.
All non client connections invoked this function for the first
PING.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 10:21:43 -06:00
Ivan Kozlovic
90d592e163 Leaf and Route RTT
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which will allow to compute the RTT
reasonably soon (since the PingInterval could be user configured
and set much higher).

For Route in PR #1101, I was sending the PING on receiving the
INFO which required changing bunch of tests. Changing that to
also use the first timer interval of 1sec and reverted changes
to route tests.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 09:34:17 -06:00
Ivan Kozlovic
7ca8723942 [FIXED] Some Leafnode issues
- On startup, verify that local account in leafnode (if specified
  can be found otherwise fail startup).
- At runtime, print error and continue trying to reconnect.
  Will need to decide a better approach.
- When using basic auth (user/password), it was possible for a
  solicited Leafnode connection to not use user/password when
  trying an URL that was discovered through gossip. The server
  now saves the credentials of a configured URL to use with
  the discovered ones.

Updated RouteRTT test in case RTT does not seem to be updated
because getting always the same value.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-23 14:08:07 -06:00
Derek Collison
10d4f1ab7a Convert leafnode solicited remotes to array
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-10 11:53:34 -07:00
Waldemar Quevedo
8147adc1b0 Add support to extend leafnodes remote tls timeout
Bump default TLS timeout for leafnode connections

Add checks for when cert_file or key_file are missing in TLS config

Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2019-06-14 08:04:44 -07:00