Commit Graph

143 Commits

Author SHA1 Message Date
David Laban
1d5cc21c3c make error actionable when adding operator+leafnodes
There are many examples in the documentation for one half of this configuration or the other,
but none which configure a leafnode remote on an operator-authenticated cluster.

The error "operator mode requires account nkeys in remotes." is not very clear or actionable.
2021-08-18 18:07:53 +01:00
Ivan Kozlovic
d7a124baaf [FIXED] LeafNode with "wss://.." url was not always initiating TLS
If the remote did not have any TLS configuration, the URL scheme
"wss://" was not used as the indicating that the connection should
be attempted as a TLS connection, causing "invalid websocket connection"
in the server attempting to create the remote leafnode connection.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-08-15 12:39:10 -06:00
Matthias Hanel
db9fd45be2 [fixed] issue where js overwrote leafnode remotes permissions from creds
Fixes #2415. We did a set instead of merge.
changes in `jwt_test.go` are to make the `createUserWithLimit` usable by my new test.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-08-12 12:57:50 -04:00
Derek Collison
09012f0847 Fix for #2401.
Leafnodes with same domain and shared system account should behave like flat jetstream network.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-04 07:07:29 -07:00
Derek Collison
925a6fe6b2 Fix for #2388. Leafnodes with no JS can seamlessly access a HUB with JS.
This is the reverse of the early work to have LNs extend a non-JS cluster.
Also have mixed mode tests as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-01 14:57:47 -07:00
Matthias Hanel
a40ea298e5 [fixed] jetstream unique server name requirement across domains (#2378)
* [fixed] jetstream unique server name requirement across domains

including domain in server info
adding check for cluster name in duplicate leaf node connection check

This does not address non unique domains in the same domain, say within
super cluster.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-07-27 18:42:19 -04:00
Matthias Hanel
9f6ba90d3e [fixing] leafnode missing retry and service export interest propagation (#2288)
* [fixing] leafnode missing retry and service export interest propagation

A missing account on initial connect attempt caused the leaf node
connection to never be established.

An account service import subscription was not propagated along leaf
node connections.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-06-17 19:10:05 -04:00
Matthias Hanel
83389db226 [fixed] hanging leaf node connection when account can't be found (#2267)
* [fixed] hanging leaf node connection when account can't be found

as a result of the issue, the leaf node connection never got created,
even after the account can be found.

Also tracing account id and name (when available)

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-06-10 11:55:16 -04:00
Ivan Kozlovic
c45f4f0353 [FIXED] LeafNode config reload failed without any change made
Issuing a configuration reload for a leafnode that has remotes
defined with remotes having more than 1 url could lead to a failure.
This is because we have introduced shuffling of remote urls but
that was done in the server's options object, which then would
cause the DeepEqual when diff'ing options to fail.
We move the suffling to the private list of urls.

The other issue was that the "old" remote option may not have
had a local account and it was not set to "$G", which could make
the DeepEqual fail.

Resolves #2273

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-06-09 16:12:39 -06:00
Derek Collison
30fae4f960 Changes to leafnodes to support multiple domains where the hub is JetStream enabled but the hub account is not, and the leafnode is.
We were incorrectly shutting things down via deny clauses when detecting the remote side/hub had JetStream capabilities.
This change moves that logic to the remote side and is signalled off the connect message which let's the remote side know
if the local leafnode account has JetStream enabled.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-07 08:39:11 -07:00
Matthias Hanel
b1dee292e6 [changed] pinned certs to check the server connected to as well (#2247)
* [changed] pinned certs to check the server connected to as well

on reload clients with removed pinned certs will be disconnected.
The check happens only on tls handshake now.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-05-24 17:28:32 -04:00
Matthias Hanel
6f6f22e9a7 [added] pinned_cert option to tls block hex(sha256(spki)) (#2233)
* [added] pinned_cert option to tls block hex(sha256(spki))

When read form config, the values are automatically lower cased.
The check when seeing the values programmatically requires 
lower case to avoid having to alter the map at this point.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-05-20 17:00:09 -04:00
Derek Collison
ea5cddd590 Moved the JetStream logic for solicited leafnodes to after we receive first info.
We needed access to the other side's JetStream status.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:46:32 -06:00
Derek Collison
8499376575 Add in support for JetStream domains.
This allows a domain to be set in the JetStream server block that sets a domain name.
Once set this signals that any leafnode connections should operate as separate JetStream domains.
Each domain <NAME> is accessible via "$JS.<NAME>.API.>", even when connected to the same domain.
Also for mixed mode you can set a jetstream block now that defines a domain but specifies "enabled: false".

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:46:32 -06:00
Derek Collison
bfd0e00271 Fix data race
Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:45:27 -06:00
Derek Collison
df664e780e Rework auto insertion of deny exports and imports for leafnodes.
This shifts to runtime vs setup time.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:45:27 -06:00
Derek Collison
0bd92e85da Add in formal support for multiple JetStream domains across leafnodes.
This CL adds in support for multiple JetStream domains using mapped subjects.
Mapping subjects aligns well with the JetStream context APIPrefix in clients.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:45:27 -06:00
Derek Collison
a9c591533c Move to Info vs Warn
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-30 15:39:09 -07:00
Derek Collison
ba31bb6165 When detecting a jetStream domain that is extended to a leafnode or leafnode cluster
we want to auto-suppress JetStream traffic on normal accounts.

We also now track remote accounts so that client info headers can be remapped.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-30 15:23:12 -07:00
Ivan Kozlovic
e2e3de9977 [FIXED] Message loop with cluster, leaf nodes and queue subs
In a setup with a cluster of servers to which 2 different leaf nodes
attach to, and queue subs are attached to one of the leaf, if the
leaf server is restarted and reconnects to another server in the
cluster, there was a risk for an infinite message loop between
some servers in the "hub" cluster.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-04-28 17:11:51 -06:00
Jaime Piña
4d04f281fc Randomize leafnode route URLs and add option to disable 2021-04-23 14:59:15 -07:00
Jaime Piña
e12181cb83 Return not ready for connection reason
Currently, we use ReadyForConnections in server tests to wait for the
server to be ready. However, when this fails we don't get a clue about
why it failed.

This change adds a new unexported method called readyForConnections that
returns an error describing which check failed. The exported
ReadyForConnections version works exactly as before. The unexported
version gets used in internal tests only.
2021-04-20 11:45:08 -07:00
Matthias Hanel
b3e355c263 [fixed] sub ref count issue across leaf node connections
This was caused by not sending subs across leaf node connections in some
cases but sending unsub in all cases. This imbalance caused
subscriptions to go away too soon. (ref count was off)

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-15 20:13:57 -04:00
Matthias Hanel
9486722e96 [fixing] subscription issue when subscribing to a super set of deny_import
If the subscription was foo. > but the server also had an import deny of foo.bar
It was legal to send the subscription. But the other server was unaware
of the restriction and sent the message anyway. The check of the
incoming message did not happen.

Fixing by ignoring messages the server is not supposed to receive.
And exchange deny_import so that the non soliciting leaf node knows to not
send these messages in the first place.

NB. merging of deny_ export/import with perms from INFO happens in processLeafnodeInfo

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-12 20:09:55 -04:00
Matthias Hanel
da4430fc8d using clearTimer(&c.ping.tmr) for cleanup
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-09 16:53:06 -04:00
Matthias Hanel
f7a772f097 Ensure that leafNodeFinishConnectProcess is only executed once.
incorporate review comments

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-09 16:53:06 -04:00
Matthias Hanel
5d1f36dd17 [Fixed] leaf node subscription permission negotiation.
On connect all subscription where sent by the soliciting leaf node.
If creds contains sub deny permissions, the leaf node would be
disconnected.
This waits for the permissions to be exchanged and checks permissions
before sending subscriptions.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-09 16:53:06 -04:00
Ivan Kozlovic
452685b9b1 [FIXED] LeafNode: set first ping timer after receiving CONNECT
We were setting the ping timer in the accepting server as soon
as the leafnode connection is created, just after sending
the INFO and setting the auth timer.

Sending a PING too soon may cause the solicit side to process
this PING and send a PONG in response, possibly before sending
the CONNECT, which the accepting side would fail as an authentication
error, since first protocol is expected to be a CONNECT.

Since LeafNode always expect a CONNECT, we always set the auth
timer. So now on accept, instead of starting the ping timer just
after sending the INFO, we will delay setting this timer only
after receiving the CONNECT.

The auth timer will take care of a stale connection in the time
it takes to receives the CONNECT.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-04-08 14:36:39 -06:00
Ivan Kozlovic
21a9bfa1d8 [FIXED] Leafnode: incorrect loop detection in multi-cluster setup
If leafnodes from a cluster were to reconnect to a server in
a different cluster, it was possible for that server to send
to the leafnodes some their own subscriptions that could cause
an inproper loop detection error.

There was also a defect that would cause subscriptions over route
for leafnode subscriptions to be registered under the wrong key,
which would lead to those subscriptions not being properly removed
on route disconnect.

Finally, during route disconnect, the leafnodes map was not updated.
This PR fixes that too.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-04-05 16:49:37 -06:00
Ivan Kozlovic
b17f38e356 [FIXED] Websocket: do not generate empty frames + LN corruption
- It was possible that when the server was sending frames to a
webbrowser, it would send empty frames. While technically not wrong,
prevent that from happening.
- Not copying enqueued buffers could cause corruption with LN+WS.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-03-26 16:17:46 -06:00
Matthias Hanel
c50ee2a1c6 [Changed] all times exposed will be computed in UTC (#1943)
This also applies to times that end up in that json.
Where applicable moved time.Now() to where it is used.
Moved calls to .UTC() to where time is created it that time is converted
later anyway.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-03-02 21:37:42 -05:00
Ivan Kozlovic
ac0a1ee8fd Fixed compression http header request/response
The issue was introduced by PR #1858.

Key points:

- Sec-WebSocket-Extensions must contain approved headers, so moving
the "no-masking" private extension to its own header "Nats-No-Masking".

- The format of the permessage-deflate negotiation response became
invalid, I have fixed that.

- For leaf nodes, if `permessage-deflate` extension is not at all
present in the response, then simply disable compression, however
if it is present but there is no server/client no context take over,
then we have to fail the connection.

- A leafnode test was not setting the "NoMasking" option so the
test TestLeafNodeWSNoMaskingRejected was not capturing possible
error if negotiation failed.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-02-01 12:10:37 -07:00
Ivan Kozlovic
9587bf8cd4 Changed option to make masking the default and option to disable it
This will allow a better experience if there is a load balancer
in between and expects websocket frames to be masked.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-29 11:22:22 -07:00
Ivan Kozlovic
2b8c6e0124 Support for Websocket Leafnode connections
Added two options in the remote leaf node configuration

- compress, for websocket only at the moment
- ws_masking, to force remote leafnode connections to mask websocket
frames (default is no masking since it is communication between
server to server)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-28 13:13:11 -07:00
Ivan Kozlovic
131be1cb33 Make TLS client/server handshake helpers function
This reduces code duplication

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-28 13:13:11 -07:00
Ivan Kozlovic
6666f5aa43 [FIXED] LeafNode: save hostname that may be used during TLS handshake
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-26 12:10:57 -07:00
Ivan Kozlovic
c9bba7d1e3 Change back "server_name" to "name" for backward compatibility
The LeafNode connect protocol's Name field had json tag "name" but
was changed to "server_name" in the JetStream cluster branch.
Changing it back to "name" to not have to deal with different
places where to get the name from.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-15 14:00:21 -07:00
Ivan Kozlovic
0d78bce9cf Fixed some leafnode issues introduced from JS cluster work
Also fixed a flapper.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-01-15 12:00:34 -07:00
Derek Collison
f0cdf89c61 JetStream Clustering WIP
Signed-off-by: Derek Collison <derek@nats.io>
2021-01-14 01:14:52 -08:00
Ivan Kozlovic
14aecb2202 Fixed headers support for inbound leafnode connection
The server that solicits a LeafNode connection does not send an
INFO, so the accepting side had no way to know if the remote
supports headers or not. The solicit side will now send the headers
support capability in the CONNECT protocol so that the receiving
side can mark the inbound connection with headers support based
on that and its own support for headers.

Resolves #1781

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-12-21 11:53:24 -07:00
Ivan Kozlovic
fc1521636c [FIXED] Config reload for gateways/leaf remote TLS configurations
Presence of TLS config in any remote gateway or leafnode would
cause the config reload to fail (because TLS config internal
content may change which fails the DeepEqual check).

This PR excludes the TLS configs in such case to check for
changes in gateways and leafnodes.

Although GW and LN config reload is technically supported, this
PR updates the internal remotes' TLS configuration so that
changes/updates to TLS certificates would take effect after
a configuration reload.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-12-11 16:56:25 -07:00
Ivan Kozlovic
406dc7ee56 Fixed data race on leafnode check for remote cluster
A newly introduced test (TestLeafNodeTwoRemotesBindToSameAccount)
had a server creating two remotes to the same server/account.
This test quite often show the data race:
```
go test -race -v -run=TestLeafNodeTwoRemotesBindToSameAccount ./server -count 100 --failfast
=== RUN   TestLeafNodeTwoRemotesBindToSameAccount
==================
WARNING: DATA RACE
Write at 0x00c000168790 by goroutine 34:
  github.com/nats-io/nats-server/v2/server.(*client).processLeafNodeConnect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1177 +0x314
  github.com/nats-io/nats-server/v2/server.(*client).processConnect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:1719 +0x9e4
  github.com/nats-io/nats-server/v2/server.(*client).parse()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/parser.go:870 +0xf88
  github.com/nats-io/nats-server/v2/server.(*client).readLoop()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:1052 +0x7a5
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func4()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0x52

Previous read at 0x00c000168790 by goroutine 32:
  github.com/nats-io/nats-server/v2/server.(*client).remoteCluster()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1203 +0x42d
  github.com/nats-io/nats-server/v2/server.(*Server).updateLeafNodes()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1375 +0x2cf
  github.com/nats-io/nats-server/v2/server.(*client).processLeafSub()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1619 +0x858
  github.com/nats-io/nats-server/v2/server.(*client).parse()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/parser.go:624 +0x5031
  github.com/nats-io/nats-server/v2/server.(*client).readLoop()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:1052 +0x7a5
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func4()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0x52

Goroutine 34 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:2627 +0xc7
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0xf7a
  github.com/nats-io/nats-server/v2/server.(*Server).startLeafNodeAcceptLoop.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:474 +0x5e
  github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1784 +0x57

Goroutine 32 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:2627 +0xc7
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0xf7a
  github.com/nats-io/nats-server/v2/server.(*Server).startLeafNodeAcceptLoop.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:474 +0x5e
  github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1784 +0x57
==================
    testing.go:965: race detected during execution of test
--- FAIL: TestLeafNodeTwoRemotesBindToSameAccount (0.05s)
```

This is because as soon as a LEAF is registered with the account, it is available
in the account's lleafs map, even before the CONNECT for this connectio is processed.
If another LEAF connection is processing a LSUB, the code goes over all leaf connections
for the account and may find the new connection that is in the process of connecting.
The check accesses c.leaf.remoteCluster unlocked which is also set unlocked during
the CONNECT. The fix is to have the set and check on that particular location using
the client's lock.

Ideally I believe that the connection should not have been in the account's lleafs,
or at least not used until the CONNECT for this leaf connection is fully processed.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-11-24 15:42:30 -07:00
Ivan Kozlovic
120b031ffd Merge pull request #1739 from nats-io/leaf-warning
[Added] account name checks for leaf nodes in operator mode
2020-11-24 12:35:31 -07:00
Matthias Hanel
b0461e3921 Fixed comment
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-11-24 12:47:41 -05:00
Matthias Hanel
a0dc9ea3e3 Reducing complexity of lookup
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-11-24 12:31:44 -05:00
Matthias Hanel
a8390b7432 Incorporating comments and moving code
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-11-23 23:27:44 -05:00
Ivan Kozlovic
f155c75da7 [FIXED] LeafNode reject duplicate remote
There was a test to prevent an errorneous loop detection when a
remote would reconnect (due to a stale connection) while the accepting
side did not detect the bad connection yet.

However, this test was racy because the test was done prior to add
the connections to the map.

In the case of a misconfiguration where the remote creates 2 different
remote connections that end-up binding to the same account in the
accepting side, then it was possible that this would not be detected.
And when it was, the remote side would be unaware since the disconnect/
reconnect attempts would not show up if not running in debug mode.

This change makes sure that the detection is no longer racy and returns
an error to the remote so at least the log/console of the remote will
show the "duplicate connection" error messages.

Resolves #1730

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-11-23 13:28:18 -07:00
Ivan Kozlovic
bea9fca24c Prevent panic when accepting TLS leafnode connections
This is an addition to PR #1652. I have simply added a check but
at this point in time there is no risk that connection is closed
this early.
I also renamed the small helper function and fixed a test that
had an improper `s.mu.Unlock()` in an error condition.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-10-21 14:53:03 -06:00
Ivan Kozlovic
9bd088e0b9 Make it a small function
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-10-19 10:50:20 -06:00
Ivan Kozlovic
3b8d00e046 [FIXED] Possible panic when server accepts TLS leafnode connection
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-10-19 10:29:32 -06:00