Commit Graph

174 Commits

Author SHA1 Message Date
Derek Collison
353d543c16 When a queue subscriber was updated multiple times over a leafnode connection we added in more shadow subscriptions which could become zombies when the connection went away.
In a case where a leafnode server had multiple queue subscribers on the same queue group, the hub server would add in multiple shadow subs. These subs would not be properly cleaned up and could lead to stale connections being associated with them.

Signed-off-by: Derek Collison <derek@nats.io>
2023-07-10 21:03:47 -07:00
Artem Seleznev
27a8b96ee3 different panic fixes
Signed-off-by: Artem Seleznev <seleznyov.artyom@gmail.com>
2023-06-02 13:19:22 +03:00
Derek Collison
80db7a22ab Optimizations for large single hub account leafnode fleets.
Added a leafnode lock to allow better traversal without copying of large leafnodes in a single hub account.

Signed-off-by: Derek Collison <derek@nats.io>
2023-05-05 13:14:49 -07:00
Ivan Kozlovic
840c264f45 Cleanup use of s.opts and fixed some lock (deadlock/inversion) issues
One should not access s.opts directly but instead use s.getOpts().
Also, server lock needs to be released when performing an account
lookup (since this may result in server lock being acquired).
A function was calling s.LookupAccount under the client lock, which
technically creates a lock inversion situation.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-03 14:09:02 -06:00
Derek Collison
c15cc0054a When a fleet of leafnodes are isolated (not routed but using same cluster) we could do better at optimizing how we update the other leafnodes.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-30 17:08:16 -07:00
Jeremy Saenz
26f241cb62 Updated LEAFZ names to use remoteServer name/id and added is_spoke 2023-02-28 18:09:24 -08:00
Jeremy Saenz
9d4a603aaf Update LEAFZ to include leafnode server/connection name 2023-02-28 14:20:18 -08:00
Derek Collison
ad53d455f8 When migrating leaders off a server when the leafnode is not connected, also ensure leaders can not return until reconnected.
Signed-off-by: Derek Collison <derek@nats.io>
2023-01-05 08:02:50 -08:00
Ivan Kozlovic
91c84c03c2 [FIXED] LeafNode: possible duplicate messages in complex setup
This is specific to setup described [here](https://github.com/nats-io/nats-server/issues/3191#issuecomment-1296974382)
and does not require JetStream to be reproduced. The added test
reproduces the above setup but without JetStream enabled in
the accounts.

Each cluster has a leafnode for a given account to the other
cluster. The accounts import/export a subject. When a consumer
is connected to cluster "B" and the producer is on cluster "A"
there was a duplicate message. Due to shadow subscription caused
by the import/export rules, an additional subscription was
sent across the leafnode.

Resolves #3191

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-03 12:34:01 -06:00
Derek Collison
9c5ae6baef Existing subs would be sent to leafnodes even though pub perms should disallow.
If the LS+ gets through we debug that it was denied, but also fixed it so that does not happen.

Signed-off-by: Derek Collison <derek@nats.io>
2022-10-27 12:31:57 -07:00
Ivan Kozlovic
d90854a45f Merge pull request #3341 from nats-io/go_1_19
Move to Go 1.19, remote io/util, fix data race and a flapper
2022-08-05 12:49:06 -06:00
Ivan Kozlovic
3c9a7cc6e5 Move to Go 1.19, remote io/util, fix data race and a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 09:55:37 -06:00
Ivan Kozlovic
7baf7bd887 [ADDED] LeafNode: Support for a SignatureHandler in remote config
This would allow in embedded use-cases where the user does not
have the ability to use a credentials file. Instead, a signature
callback is specified and invoked by the server sends the CONNECT
protocol. The user is responsible to provide the JWT and sign the
nonce.

Resolves #3331

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-04 16:59:09 -06:00
Ivan Kozlovic
d84d9f8288 Use specific boolean for a leaf test instead of using leafNodeEnabled
A test TestJetStreamClusterLeafNodeSPOFMigrateLeaders was added at
some point that needed the remotes to stop (re)connecting. It made
use of existing leafNodeEnabled that was used for GW/Leaf interest
propagation races to disable the reconnect, but that may not be
the best approach since it could affect users embedding servers
and adding leafnodes "dynamically".

So this PR introduced a specific boolean specific for that test.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-04 10:00:11 -06:00
Derek Collison
28ccaa4371 Direct get across a leafnode using cross domain mappings to a queue subscriber did not work.
The interest moved across the leafnode would be for the mapping, and not the actual qsub.
So when received if we did detect that we are mapped and do not have a queue filter present make sure to ignore.
This will allow queue subscriber processing on the local server that received the message from the leafnode.

Signed-off-by: Derek Collison <derek@nats.io>
2022-08-03 20:21:28 -07:00
Derek Collison
c14fda51e7 Direct access to JetStream resources would be affected if across a leafnode that was down.
This allows a solciting leafnode config to ask that any JetStream cluster assets that are a current leader have the leader stepdown.

Signed-off-by: Derek Collison <derek@nats.io>
2022-07-05 12:35:09 -07:00
Derek Collison
e6479dafd2 Close leafnode connection when same cluster name detected
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-30 15:34:22 -07:00
Ivan Kozlovic
b5c9583ee2 Reject configuration with value below 2.8.0
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-07 12:49:34 -06:00
Ivan Kozlovic
7fa2676353 Fixed comment typos and some rewording
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-07 09:22:51 -06:00
Ivan Kozlovic
9e6f965913 [ADDED] LeafNode min_version new option
If set, a server configured to accept leafnode connections will
reject a remote server whose version is below that value. Note
that servers prior to v2.8.0 are not sending their version
in the CONNECT protocol, which means that anything below 2.8.0
would be rejected.

Configuration example:
```
leafnodes {
    port: 7422
    min_version: 2.8.0
}
```
The option is a string and can have the "v" prefix:
```
min_version: "v2.9.1"
```
Note that although suffix such as `-beta` would be accepted,
only the major, minor and update are used for the version comparison.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-06 18:40:33 -06:00
Matthias Hanel
9a2da9ed8c Adding denies $KV.>/$OBJ.> along leaf connections on differing domain (#2916)
* Adding denies $KV.>/$OBJ.> along leaf connections on differing domain

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-09 13:17:59 -05:00
Ivan Kozlovic
85b3f8a7fd Gateways: data race when setting first ping timer
This was introduced when fixing #2881. The call to setFirstPingTimer
needed to be done under the client's lock.

Moved setFirstPingTimer from a server receiver to a client receiver.
The only reason it was a server receiver is because we need the
server options, but c.srv is always set when invoking this function,
so we will get the server from c.srv in that function now.

Related to #2881

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-04 19:55:07 -07:00
Ivan Kozlovic
1e53d81cb3 [FIXED] LeafNode: queue sub interest not properly sent to new LN
In complex situations, queue members count across various servers
may not be properly accounted for when sent to a new leafnode
connection.

The new test TestLeafNodeQueueGroupWithLateLNJoin has a drawing
of such setup, when after LN1 joined, and then queue members
were removed with 1 left, LN1 was told that there was no
more interest, so message published to LN1 would not reach
the remaining queue sub connected to LN2.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-04 17:03:06 -07:00
Matthias Hanel
3e8b66286d Js leaf deny (#2693)
Along a leaf node connection, unless the system account is shared AND the JetStream domain name is identical, the default JetStream traffic (without a domain set) will be denied.

As a consequence, all clients that wants to access a domain that is not the one in the server they are connected to, a domain name must be specified.
Affected from this change are setups where: a leaf node had no local JetStream OR the server the leaf node connected to had no local JetStream. 
One of the two accounts that are connected via a leaf node remote, must have no JetStream enabled.
The side that does not have JetStream enabled, will loose JetStream access and it's clients must set `nats.Domain` manually.

For workarounds on how to restore the old behavior, look at:
https://github.com/nats-io/nats-server/pull/2693#issuecomment-996212582

New config values added:
`default_js_domain` is a mapping from account to domain, settable when JetStream is not enabled in an account.
`extension_hint` are hints for non clustered server to start in clustered mode (and be usable to extend)
`js_domain` is a way to set the JetStream domain to use for mqtt.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-16 16:53:20 -05:00
Matthias Hanel
581dfb27d0 hitting an account limit left an outgoing leaf node conn in bad state (#2715)
since no error was traced or the connection closed, subscriptions where
not forwarded

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-11-30 17:48:07 -05:00
Phil Pennock
fc6df0fbbc Redact URLs before logging or returning in error (#2643)
* Redact URLs before logging or returning in error

This does not affect strings which failed to parse, and in such a scenario
there's a mix of "which evil" to accept; we can't sanely find what should be
redacted in those cases, so we leave them alone for debugging.

The JWT library returns some errors for Operator URLs, but it rejects URLs
which contain userinfo, so there can't be passwords in those and they're safe.

Fixes #2597

* Test the URL redaction auxiliary functions

* End-to-end tests for secrets in debug/trace

Create internal/testhelper and move DummyLogger there, so it can be used from
the test/ sub-dir too.

Let DummyLogger optionally accumulate all log messages, not just retain the
last-seen message.

Confirm no passwords logged by TestLeafNodeBasicAuthFailover.

Change TestNoPasswordsFromConnectTrace to check all trace messages, not just the
most recent.

Validate existing trace redaction in TestRouteToSelf.

* Test for password in solicited route reconnect debug
2021-10-27 12:44:59 -04:00
Derek Collison
ebc581012e Fixed a condition where JetStream assets could be created in multiple leafnodes.
Also added in optional Domain to StreamInfo.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-14 14:17:49 -07:00
Derek Collison
c841269f71 Make sure to suppress dupes on JS deny all for system account
Signed-off-by: Derek Collison <derek@nats.io>
2021-09-08 17:09:25 -07:00
Matthias Hanel
41a253dabb fix daisy chained leaf node subject propagation issue. (#2468)
fixes #2448 

initLeafNodeSmapAndSendSubs did not pick up enough local subscriptions.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-08-25 18:10:09 -04:00
Derek Collison
da577e2065 Added ability for leaafnodes to allow broader subscriptions to pass through and not cause disconnects.
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-25 11:00:01 -07:00
Derek Collison
6294c0b0c7 Fixes for reversing perms on the hub side of a leafnode.
Note since the hub will disconnect currently on a subscription from a soliciting leaf, we still do strict checks there.
We always properly check if data can flow, so we could remove the sub checks all together.

I did look into ways of returning a scoped subject for explicit allow subscriptions when presented with a wildcard, however this would have meant resolving multiple items.
E.g. allow ['foo', 'bar', 'foo.bar']
 With a sub of '*' that would have to expand to ['foo', 'bar']
 With a sub of '>' that would have to expand to ['foo', 'bar', "foo.bar']
 With a sub of 'foo.*' that would have to expand to ['foo.bar']

I may sleep on this and revisit if I think I can get it to work properly.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-24 20:30:34 -07:00
David Laban
1d5cc21c3c make error actionable when adding operator+leafnodes
There are many examples in the documentation for one half of this configuration or the other,
but none which configure a leafnode remote on an operator-authenticated cluster.

The error "operator mode requires account nkeys in remotes." is not very clear or actionable.
2021-08-18 18:07:53 +01:00
Ivan Kozlovic
d7a124baaf [FIXED] LeafNode with "wss://.." url was not always initiating TLS
If the remote did not have any TLS configuration, the URL scheme
"wss://" was not used as the indicating that the connection should
be attempted as a TLS connection, causing "invalid websocket connection"
in the server attempting to create the remote leafnode connection.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-08-15 12:39:10 -06:00
Matthias Hanel
db9fd45be2 [fixed] issue where js overwrote leafnode remotes permissions from creds
Fixes #2415. We did a set instead of merge.
changes in `jwt_test.go` are to make the `createUserWithLimit` usable by my new test.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-08-12 12:57:50 -04:00
Derek Collison
09012f0847 Fix for #2401.
Leafnodes with same domain and shared system account should behave like flat jetstream network.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-04 07:07:29 -07:00
Derek Collison
925a6fe6b2 Fix for #2388. Leafnodes with no JS can seamlessly access a HUB with JS.
This is the reverse of the early work to have LNs extend a non-JS cluster.
Also have mixed mode tests as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-01 14:57:47 -07:00
Matthias Hanel
a40ea298e5 [fixed] jetstream unique server name requirement across domains (#2378)
* [fixed] jetstream unique server name requirement across domains

including domain in server info
adding check for cluster name in duplicate leaf node connection check

This does not address non unique domains in the same domain, say within
super cluster.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-07-27 18:42:19 -04:00
Matthias Hanel
9f6ba90d3e [fixing] leafnode missing retry and service export interest propagation (#2288)
* [fixing] leafnode missing retry and service export interest propagation

A missing account on initial connect attempt caused the leaf node
connection to never be established.

An account service import subscription was not propagated along leaf
node connections.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-06-17 19:10:05 -04:00
Matthias Hanel
83389db226 [fixed] hanging leaf node connection when account can't be found (#2267)
* [fixed] hanging leaf node connection when account can't be found

as a result of the issue, the leaf node connection never got created,
even after the account can be found.

Also tracing account id and name (when available)

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-06-10 11:55:16 -04:00
Ivan Kozlovic
c45f4f0353 [FIXED] LeafNode config reload failed without any change made
Issuing a configuration reload for a leafnode that has remotes
defined with remotes having more than 1 url could lead to a failure.
This is because we have introduced shuffling of remote urls but
that was done in the server's options object, which then would
cause the DeepEqual when diff'ing options to fail.
We move the suffling to the private list of urls.

The other issue was that the "old" remote option may not have
had a local account and it was not set to "$G", which could make
the DeepEqual fail.

Resolves #2273

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-06-09 16:12:39 -06:00
Derek Collison
30fae4f960 Changes to leafnodes to support multiple domains where the hub is JetStream enabled but the hub account is not, and the leafnode is.
We were incorrectly shutting things down via deny clauses when detecting the remote side/hub had JetStream capabilities.
This change moves that logic to the remote side and is signalled off the connect message which let's the remote side know
if the local leafnode account has JetStream enabled.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-07 08:39:11 -07:00
Matthias Hanel
b1dee292e6 [changed] pinned certs to check the server connected to as well (#2247)
* [changed] pinned certs to check the server connected to as well

on reload clients with removed pinned certs will be disconnected.
The check happens only on tls handshake now.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-05-24 17:28:32 -04:00
Matthias Hanel
6f6f22e9a7 [added] pinned_cert option to tls block hex(sha256(spki)) (#2233)
* [added] pinned_cert option to tls block hex(sha256(spki))

When read form config, the values are automatically lower cased.
The check when seeing the values programmatically requires 
lower case to avoid having to alter the map at this point.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-05-20 17:00:09 -04:00
Derek Collison
ea5cddd590 Moved the JetStream logic for solicited leafnodes to after we receive first info.
We needed access to the other side's JetStream status.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:46:32 -06:00
Derek Collison
8499376575 Add in support for JetStream domains.
This allows a domain to be set in the JetStream server block that sets a domain name.
Once set this signals that any leafnode connections should operate as separate JetStream domains.
Each domain <NAME> is accessible via "$JS.<NAME>.API.>", even when connected to the same domain.
Also for mixed mode you can set a jetstream block now that defines a domain but specifies "enabled: false".

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:46:32 -06:00
Derek Collison
bfd0e00271 Fix data race
Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:45:27 -06:00
Derek Collison
df664e780e Rework auto insertion of deny exports and imports for leafnodes.
This shifts to runtime vs setup time.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:45:27 -06:00
Derek Collison
0bd92e85da Add in formal support for multiple JetStream domains across leafnodes.
This CL adds in support for multiple JetStream domains using mapped subjects.
Mapping subjects aligns well with the JetStream context APIPrefix in clients.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:45:27 -06:00
Derek Collison
a9c591533c Move to Info vs Warn
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-30 15:39:09 -07:00
Derek Collison
ba31bb6165 When detecting a jetStream domain that is extended to a leafnode or leafnode cluster
we want to auto-suppress JetStream traffic on normal accounts.

We also now track remote accounts so that client info headers can be remapped.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-30 15:23:12 -07:00