Commit Graph

571 Commits

Author SHA1 Message Date
Derek Collison
4175e4ee9c Merge branch 'main' into dev 2023-05-06 09:55:34 -07:00
Derek Collison
80db7a22ab Optimizations for large single hub account leafnode fleets.
Added a leafnode lock to allow better traversal without copying of large leafnodes in a single hub account.

Signed-off-by: Derek Collison <derek@nats.io>
2023-05-05 13:14:49 -07:00
Ivan Kozlovic
8a4ead22bc Updates based on code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-03 16:14:51 -06:00
Ivan Kozlovic
95e4f2dfe1 Fixed accounts configuration reload
Issues could manifest with subscription interest not properly
propagated.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-03 14:35:06 -06:00
Ivan Kozlovic
840c264f45 Cleanup use of s.opts and fixed some lock (deadlock/inversion) issues
One should not access s.opts directly but instead use s.getOpts().
Also, server lock needs to be released when performing an account
lookup (since this may result in server lock being acquired).
A function was calling s.LookupAccount under the client lock, which
technically creates a lock inversion situation.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-03 14:09:02 -06:00
Derek Collison
0321eb6484 Merge branch 'main' into dev 2023-04-29 19:52:57 -07:00
Ivan Kozlovic
349f01e86a Change the absence of compression setting to default to "accept"
In that mode, a server accepts and will switch to same compression
level than the remote (if one is set) but will not initiate compression.
So if all servers in a cluster do not have compression setting set,
it defaults to "accept" which means that compression is "off".

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-28 15:33:17 -06:00
Ivan Kozlovic
5b8c9ee364 Changes based on code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-28 14:34:32 -06:00
Ivan Kozlovic
d6fe9d4c2d [ADDED] Support for route S2 compression
The new field `compression` in the `cluster{}` block allows to
specify which compression mode to use between servers.

It can be simply specified as a boolean or a string for the
simple modes, or as an object for the "s2_auto" mode where
a list of RTT thresholds can be specified.

By default, if no compression field is specified, the server
will use the s2_auto mode with default RTT thresholds of
10ms, 50ms and 100ms for the "uncompressed", "fast", "better"
and "best" modes.

```
cluster {
..
  # Possible values are "disabled", "off", "enabled", "on",
  # "accept", "s2_fast", "s2_better", "s2_best" or "s2_auto"
  compression: s2_fast
}
```

To specify a different list of thresholds for the s2_auto,
here is how it would look like:
```
cluster {
..
  compression: {
    mode: s2_auto
    # This means that for RTT up to 5ms (included), then
    # the compression level will be "uncompressed", then
    # from 5ms+ to 15ms, the mode will switch to "s2_fast",
    # then from 15ms+ to 50ms, the level will switch to
    # "s2_better", and anything above 50ms will result
    # in the "s2_best" compression mode.
    rtt_thresholds: [5ms, 15ms, 50ms]
  }
}
```

Note that the "accept" mode means that a server will accept
compression from a remote and switch to that same compression
mode, but will otherwise not initiate compression. That is,
if 2 servers are configured with "accept", then compression
will actually be "off". If one of the server had say s2_fast
then they would both use this mode.

If a server has compression mode set (other than "off") but
connects to an older server, there will be no compression between
those 2 routes.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-27 17:59:25 -06:00
Derek Collison
a66ac8cb9b The server's Start() used to block but no longer does. This updates tests and function comment.
Fix for #4110

Signed-off-by: Derek Collison <derek@nats.io>
2023-04-27 06:55:03 -07:00
Derek Collison
09afcee9d9 Merge branch 'main' into dev 2023-04-17 08:43:08 -07:00
Derek Collison
9a3e0b783c Fix for a data race when setting up service import subscriptions.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-17 06:40:09 -07:00
Derek Collison
03d5cf0871 Merge branch 'main' into dev 2023-04-04 15:14:41 -07:00
Derek Collison
9f95e993e2 Do not inline inbound system messages to avoid blocking routes, gateways etc.
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-04 13:53:21 -07:00
Ivan Kozlovic
83c5c0177a Changes based on code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:28 -06:00
Ivan Kozlovic
fe5d6bede4 Fixed accounts configuration reload
Issues could manifest with subscription interest not properly
propagated.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:28 -06:00
Ivan Kozlovic
bd1b7b8d55 Cleanup use of s.opts and fixed some lock (deadlock/inversion) issues
One should not access s.opts directly but instead use s.getOpts().
Also, server lock needs to be released when performing an account
lookup (since this may result in server lock being acquired).
A function was calling s.LookupAccount under the client lock, which
technically creates a lock inversion situation.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:28 -06:00
Ivan Kozlovic
105237cba8 [ADDED] Multiple routes and ability to have per-account routes
New configuration fields:
```
cluster {
   ...
   pool_size: 5
   accounts: ["A", "B"]
}
```

The configuration `pool_size` in the example above means that this
server will create 5 routes to a remote server, assuming that that
server has the same `pool_size` setting.

Accounts (which are not part of the `accounts[]` configuration)
are assigned a specific route in this pool, and this will be the
same route on all servers in the cluster.

Accounts that are defined in the `accounts` field will each have
a dedicated route connection. This will allow suppression of the
account name in some of the route protocols, reducing bytes transmitted
which may increase performance.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:25 -06:00
Derek Collison
48a5d270b2 Merge branch 'main' into dev 2023-04-02 04:23:52 -07:00
Derek Collison
e54019f87f All should be lowercase
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-02 03:53:01 -07:00
Derek Collison
7ce23c46ce Merge branch 'main' into dev 2023-02-21 08:34:08 -08:00
Neil Twigg
68961ffedd Refactor ipQueue to use generics, reduce allocations 2023-02-21 14:50:09 +00:00
Derek Collison
3c889478bd Merge pull request #3719 from nats-io/auth_callout
Authorization Callouts
2023-01-03 15:34:10 -08:00
Derek Collison
ff79afef39 Merge branch 'main' into dev 2022-12-30 12:23:23 -08:00
Neil Twigg
14d0ba1c65 Fix some lint errors after move to golangci-lint 2022-12-30 20:00:08 +00:00
Derek Collison
2daf90493b Authentication and Authorization callouts for server configuration mode.
This adds the ability to augment or override the NATS auth system.

A server will send a signed request to $SYS.REQ.USER.AUTH on the specified account. The request will contain client information, all client options sent to the server, and optionally TLS information and client certificates.
The external auth service will respond with an empty message if not authorized, or a signed User JWT that the user will bind to.

The response can change the account the client will be bound to.

Signed-off-by: Derek Collison <derek@nats.io>
2022-12-28 10:32:45 -08:00
Derek Collison
5738eeb089 Merge branch 'main' into dev 2022-11-25 11:10:01 -08:00
Derek Collison
a5814dad1f In operator mode do not set a no_auth_user for $G.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-25 10:20:35 -08:00
Derek Collison
3351213555 Merge branch 'main' into dev 2022-11-21 22:22:45 -08:00
Derek Collison
06bab2c4de If no_auth_user is set, clear auth required for server info.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-21 20:26:54 -08:00
Todd Beets
571a375f2e add unique_tag to JetStreamConfig for varz/jsz visibility 2022-11-11 13:36:13 -08:00
Ivan Kozlovic
3ec42d5b85 Updates to PR #3611
- Save the TLS name only if not already set
- Use the passed URLs slice instead of using s.getOpts().Routes
- Enhanced the test
- Fixed an unrelated DATA RACE report

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-08 09:36:08 -07:00
Ivan Kozlovic
2d181e1c27 [FIXED] Routing: TLS connections to discovered server may fail
The server was not setting "server name" in the TLS configuration
for route connections, which may lead to failed (re)connect if
the certificate does not allow for the IP and the URL did not
have the hostname, which would happen with gossip protocol.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-07 17:26:17 -07:00
Derek Collison
15dc72db50 Removed migration of ephemerals, added proper signaling for pul consumers pending requests.
Signed-off-by: Derek Collison <derek@nats.io>
2022-10-25 14:35:20 -07:00
Ivan Kozlovic
bec51ed52b [FIXED] JetStream: User given named ephemeral lost after migration
If an ephemeral was given a name by the user, if the consumer leader
was then shutdown, the ephemeral would be migrated using a server
generated new name instead of keeping the user given name.

Also, in some cases the migration would not even occur. This was
likely due to the fact that RAFT node(s) were shutdown prior to
the ephemeral migration code was invoked.

Resolves #3550

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-14 15:20:45 -06:00
Ivan Kozlovic
170ff49837 [ADDED] JetStream: peer (the hash of server name) in statsz/jsz
A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something
like this:
```
...
    "meta_cluster": {
      "name": "local",
      "leader": "A",
      "peer": "NUmM6cRx",
      "replicas": [
        {
          "name": "B",
          "current": true,
          "active": 690369000,
          "peer": "b2oh2L6w"
        },
        {
          "name": "Server name unknown at this time (peerID: jZ6RvVRH)",
          "current": false,
          "offline": true,
          "active": 0,
          "peer": "jZ6RvVRH"
        }
      ],
      "cluster_size": 3
    }
```
Note the "peer" field following the "leader" field that contains
the server name. The new field is the node ID, which is a hash of
the server name.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-16 15:31:37 -06:00
Matthias Schneider
a58a7bf1ec server: expire display Never instead of 1970 2022-09-08 10:03:04 +02:00
Derek Collison
56e177c329 Allow stream msgs to be compressed within the raft layer and during upper layer catchups.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-30 16:10:57 -07:00
Ivan Kozlovic
9a6a2c31ee [ADDED] JetStream: Ability to configure the per server max catchup bytes
The original value was hardcoded to 128MB and 32MB per stream. The
per-server limit is lowered to 32MB but is configurable with
a new configuration parameter:
```
jetstream {
   max_catchup: 8MB
}
```

The per-stream limit was also lowered from 32MB/128,000msgs to
8MB/32,000 messages.

Tests have shown no difference in performance for fast links.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-30 13:46:13 -06:00
Ivan Kozlovic
284e35132b Merge pull request #3387 from nats-io/fix_3317
[ADDED] Monitoring: TLS Peer Certificates in Connz when auth is on
2022-08-24 14:28:01 -06:00
Ivan Kozlovic
951b7c38f6 [ADDED] Monitoring: TLS Peer Certificates in Connz when auth is on
Add basic peer certificates information in /connz endpoint when
the "auth" option is provided.

Resolves #3317

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-22 11:48:49 -06:00
Ivan Kozlovic
f6c4e5fcee [CHANGED] Gateway: Switch all accounts to interest-only mode
We are phasing out the optimistic-only mode. Servers accepting
inbound gateway connections will switch the accounts to interest-only
mode.

The servers with outbound gateway connection will check interest
and ignore the "optimistic" mode if it is known that the corresponding
inbound is going to switch the account to interest-only. This is
done using a boolean in the gateway INFO protocol.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-19 16:41:44 -06:00
Ivan Kozlovic
7de4497815 Install consumer snapshot on clean exit and few other fixes
- didRemove in applyMetaEntries() could be reset when processing
multiple entries
- change "no race" test names to include JetStream
- separate raft nodes leader stepdown and stop in server
shutdown process
- in InstallSnapshot, call wal.Compact() with lastIndex+1

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-16 17:05:49 -06:00
Ivan Kozlovic
3c9a7cc6e5 Move to Go 1.19, remote io/util, fix data race and a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 09:55:37 -06:00
Ivan Kozlovic
5bc03c7637 Update leafNodeEnabled value on Start()
Maybe that is the place it could be set and not in NewServer(), but
want to minimize risk of breaking something close to 2.9.0

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-04 14:13:35 -06:00
Ivan Kozlovic
d84d9f8288 Use specific boolean for a leaf test instead of using leafNodeEnabled
A test TestJetStreamClusterLeafNodeSPOFMigrateLeaders was added at
some point that needed the remotes to stop (re)connecting. It made
use of existing leafNodeEnabled that was used for GW/Leaf interest
propagation races to disable the reconnect, but that may not be
the best approach since it could affect users embedding servers
and adding leafnodes "dynamically".

So this PR introduced a specific boolean specific for that test.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-04 10:00:11 -06:00
Matthias Hanel
ed2cb280cc Move to crypto/rand for nonce generation (#3324)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-08-03 06:52:26 +02:00
Matthias Hanel
d53d2d0484 [Added] account specific monitoring endpoint(s) (#3250)
Added http monitoring endpoint /accstatz
It responds with a list of statz for all accounts with local connections
the argument "unused=1" can be provided to get statz for all accounts
This endpoint is also exposed as nats request under:

This monitoring endpoint is exposed via the system account.
$SYS.REQ.ACCOUNT.*.STATZ
Each server will respond with connection statistics for the requested
account. The format of the data section is a list (size 1) identical to the event
$SYS.ACCOUNT.%s.SERVER.CONNS which is sent periodically as well as on
connect/disconnect. Unless requested by options, server without the account,
or server where the account has no local connections, will not respond.

A PING endpoint exists as well. The response format is identical to
$SYS.REQ.ACCOUNT.*.STATZ
(however the data section will contain more than one account, if they exist)
In addition to general filter options the request takes a list of accounts and
an argument to include accounts without local connections (disabled by default)
$SYS.REQ.ACCOUNT.PING.STATZ

Each account has a new system account import where the local subject
$SYS.REQ.ACCOUNT.PING.STATZ essentially responds as if
the importing account name was used for $SYS.REQ.ACCOUNT.*.STATZ

The only difference between requesting ACCOUNT.PING.STATZ from within
the system account and an account is that the later can only retrieve
statz for the account the client requests from.

Also exposed the monitoring /healthz via the system account under
$SYS.REQ.SERVER.*.HEALTHZ
$SYS.REQ.SERVER.PING.HEALTHZ
No dedicated options are available for these.
HEALTHZ also accept general filter options.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-07-12 21:50:32 +02:00
Ivan Kozlovic
5ed212ce04 Rework startupComplete gate from PR #2360
The "InProcess" change make readyForConnections() possibly return
much faster than it used to, which could cause tests to fail.

Restore the original behavior, but in case of DontListen option
wait on the startupComplete gate.

Also fixed some missing checks for leafnode connections.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-06-28 17:36:39 -06:00
Neil Alexander
558293e096 Fix the lock 2022-06-28 18:05:57 +01:00