182 Commits

Author SHA1 Message Date
Ivan Kozlovic
ce96de2ed5 [ADDED] TLS: Handshake First for client connections
A new option instructs the server to perform the TLS handshake first,
that is prior to sending the INFO protocol to the client.

Only clients that implement an equivalent option will be able to
connect if the server runs with this option enabled.

The configuration would look something like this:
```
...
tls {
    cert_file: ...
    key_file: ...

    handshake_first: true
}
```

The same option can be set to "auto" or a Go time duration to fall back
to the old behavior. This is intended for deployments where it is known
that not all clients have been upgraded to a client library providing
the TLS handshake first option.

After the delay has elapsed without receiving the TLS handshake from
the client, the server reverts to sending the INFO protocol so that
older clients can connect. Clients that do connect with the "TLS first"
option will be marked as such in the monitoring's Connz page/result.
It will allow the administrator to keep track of applications still
needing to upgrade.

The configuration would be similar to:
```
...
tls {
    cert_file: ...
    key_file: ...

    handshake_first: auto
}
```
With the above value, the fallback delay used by the server is 50ms.

The duration can be explicitly set, say 300 milliseconds:
```
...
tls {
    cert_file: ...
    key_file: ...

    handshake_first: "300ms"
}
```

It is understood that any configuration other than "true" will result
in the server sending the INFO protocol after the elapsed amount of
time without the client initiating the TLS handshake. Therefore, for
administrators that do not want any data transmitted in plain text,
the value must be set to "true" only. It will require applications
to be updated to a library that provides the option, which may or
may not be readily available.
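
As a minimal sketch of the three accepted forms described above ("true", "auto", or a Go duration), the value could be interpreted as follows. The type `handshakeFirst` and the function `parseHandshakeFirst` are hypothetical names used for illustration, not the server's actual code; rejecting non-positive durations is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// autoFallback is the delay this commit message states for "auto".
const autoFallback = 50 * time.Millisecond

// handshakeFirst is a hypothetical holder for the parsed setting.
type handshakeFirst struct {
	Enabled  bool          // perform the TLS handshake before sending INFO
	Fallback time.Duration // 0 means never fall back to sending INFO first
}

// parseHandshakeFirst interprets "true", "false", "auto", or a Go duration.
func parseHandshakeFirst(v string) (handshakeFirst, error) {
	s := strings.ToLower(strings.TrimSpace(v))
	switch s {
	case "true":
		return handshakeFirst{Enabled: true}, nil
	case "false":
		return handshakeFirst{}, nil
	case "auto":
		return handshakeFirst{Enabled: true, Fallback: autoFallback}, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return handshakeFirst{}, fmt.Errorf("invalid handshake_first value %q: %w", v, err)
	}
	// Assumption: a zero or negative delay is treated as a configuration error.
	if d <= 0 {
		return handshakeFirst{}, fmt.Errorf("handshake_first delay must be positive, got %v", d)
	}
	return handshakeFirst{Enabled: true, Fallback: d}, nil
}

func main() {
	for _, v := range []string{"true", "auto", "300ms"} {
		hf, err := parseHandshakeFirst(v)
		fmt.Println(v, hf.Enabled, hf.Fallback, err)
	}
}
```

Only the "true" form yields a zero fallback, which matches the note that it is the only value guaranteeing nothing is sent in plain text.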

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-10-10 09:46:01 -06:00
Derek Collison
dba03dbc2f Optimizations to reduce contention for high connections in a JetStream enabled account with high API usage.
Several strategies which are listed below.

1. Checking a RaftNode to see if it is the leader now uses atomics.
2. Checking if we are the JetStream meta leader from the server now uses an atomic.
3. Accessing the JetStream context no longer requires a server lock, uses atomic.Pointer.
4. Filestore syncBlocks would hold msgBlock locks during sync, now does not.
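
The first three strategies can be sketched roughly as below, assuming Go 1.19+ for `atomic.Bool` and `atomic.Pointer`. The `node` and `jsContext` types and their methods are hypothetical stand-ins, not the server's real structures.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// jsContext stands in for the JetStream context; the type is hypothetical.
type jsContext struct{ domain string }

// node is a hypothetical, trimmed-down server or raft node.
type node struct {
	leader atomic.Bool               // read on hot paths without taking a mutex
	js     atomic.Pointer[jsContext] // swapped atomically, loaded lock-free
}

func (n *node) setLeader(isLeader bool)    { n.leader.Store(isLeader) }
func (n *node) isLeader() bool             { return n.leader.Load() }
func (n *node) setJetStream(js *jsContext) { n.js.Store(js) }
func (n *node) jetStream() *jsContext      { return n.js.Load() }

func main() {
	n := &node{}
	fmt.Println(n.isLeader(), n.jetStream() == nil)

	n.setLeader(true)
	n.setJetStream(&jsContext{domain: "hub"})
	fmt.Println(n.isLeader(), n.jetStream().domain)
}
```

Readers no longer queue behind a shared lock just to check a flag or fetch a pointer, which is where the contention reduction would come from.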

Signed-off-by: Derek Collison <derek@nats.io>
2023-09-30 14:52:15 -07:00
Neil Twigg
11feadfe7b Add prof_block_rate option for enabling/configuring the block profile
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-25 21:04:25 +01:00
Neil Twigg
01872d2aa8 Fix empty string constants
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-19 19:07:17 +01:00
Neil Twigg
8b60131e92 Fix race condition in clientHasMovedToDifferentAccount
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-19 18:52:34 +01:00
Todd Beets
99dc11551b OCSP Peer Verification
2023-07-19 12:14:21 -07:00
Ivan Kozlovic
7cf00c8ef7 Updates based on PR code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-19 12:16:24 -06:00
Ivan Kozlovic
607b0ca7f3 Fixed cluster permissions configuration reload
This is a rework of incorrect changes made in PR #4001.
This affects only the `dev` branch.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-18 19:02:03 -06:00
Ivan Kozlovic
67498af2dc [ADDED] LeafNode: Support for s2 compression
This is similar to PR #4115 but for LeafNodes.
Compression mode can be set on both sides (the accept side and in remotes).
```
leafnodes {
   port: 7422
   compression: s2_best
   remotes [
       {
         url: "nats://host2:7422"
         compression: s2_better
       }
   ]
}
```
Possible modes are similar to those for routes (described in PR #4115),
except that when not defined we default to `s2_auto`.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-15 17:42:39 -06:00
Ivan Kozlovic
0a02f2121c [ADDED] LeafNode: TLSHandshakeFirst option
A new field in `tls{}` blocks forces the server to do the TLS handshake
before sending the INFO protocol.
```
leafnodes {
   port: 7422
   tls {
      cert_file: ...
      ...
      handshake_first: true
   }
   remotes [
       {
         url: tls://host:7423
         tls {
            ...
            handshake_first: true
         }
       }
   ]
}
```
Note that if `handshake_first` is set in the "accept" side, the
first `tls{}` block in the example above, a server trying to
create a LeafNode connection to this server would need to have
`handshake_first` set to true inside the `tls{}` block of
the corresponding remote.

Configuration reload of leafnodes is generally not supported,
but TLS certificates can be reloaded and the support for this
new field was also added.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-01 16:41:51 -06:00
Ivan Kozlovic
5b8c9ee364 Changes based on code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-28 14:34:32 -06:00
Ivan Kozlovic
d6fe9d4c2d [ADDED] Support for route S2 compression
The new field `compression` in the `cluster{}` block allows specifying
which compression mode to use between servers.

It can be simply specified as a boolean or a string for the
simple modes, or as an object for the "s2_auto" mode where
a list of RTT thresholds can be specified.

By default, if no compression field is specified, the server
will use the s2_auto mode with default RTT thresholds of
10ms, 50ms and 100ms for the "uncompressed", "fast", "better"
and "best" modes.

```
cluster {
..
  # Possible values are "disabled", "off", "enabled", "on",
  # "accept", "s2_fast", "s2_better", "s2_best" or "s2_auto"
  compression: s2_fast
}
```

To specify a different list of thresholds for the s2_auto mode,
here is what it would look like:
```
cluster {
..
  compression: {
    mode: s2_auto
    # This means that for RTT up to 5ms (included), then
    # the compression level will be "uncompressed", then
    # from 5ms+ to 15ms, the mode will switch to "s2_fast",
    # then from 15ms+ to 50ms, the level will switch to
    # "s2_better", and anything above 50ms will result
    # in the "s2_best" compression mode.
    rtt_thresholds: [5ms, 15ms, 50ms]
  }
}
```
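
The threshold comments in the example above can be sketched as a small selection function. This is only an illustration of the described banding, assuming exactly three ascending thresholds; `selectMode`, `thresholdsMs` and `ms` are hypothetical helpers, not the server's API.

```go
package main

import (
	"fmt"
	"time"
)

// modes are ordered from the lowest to the highest RTT band.
var modes = [4]string{"uncompressed", "s2_fast", "s2_better", "s2_best"}

// ms builds a duration from a number of milliseconds.
func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

// thresholdsMs builds the three ascending thresholds from milliseconds.
func thresholdsMs(a, b, c int) [3]time.Duration {
	return [3]time.Duration{ms(a), ms(b), ms(c)}
}

// selectMode returns the mode for the given RTT. An RTT equal to a
// threshold stays in the lower band ("up to 5ms (included)").
func selectMode(rtt time.Duration, thresholds [3]time.Duration) string {
	for i, t := range thresholds {
		if rtt <= t {
			return modes[i]
		}
	}
	return modes[len(modes)-1]
}

func main() {
	custom := thresholdsMs(5, 15, 50)
	for _, n := range []int{5, 6, 15, 50, 51} {
		fmt.Printf("%v -> %s\n", ms(n), selectMode(ms(n), custom))
	}
}
```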

Note that the "accept" mode means that a server will accept
compression from a remote and switch to that same compression
mode, but will otherwise not initiate compression. That is,
if 2 servers are configured with "accept", then compression
will actually be "off". If one of the servers had, say, s2_fast
then they would both use this mode.

If a server has compression mode set (other than "off") but
connects to an older server, there will be no compression between
those 2 routes.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-27 17:59:25 -06:00
Derek Collison
a319d24345 Merge branch 'main' into dev
2023-04-13 21:03:05 -07:00
Waldemar Quevedo
a4833d0889 Fix raft log debug reloading
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-04-13 14:57:04 -07:00
Ivan Kozlovic
83c5c0177a Changes based on code review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:28 -06:00
Ivan Kozlovic
fe5d6bede4 Fixed accounts configuration reload
Issues could manifest with subscription interest not properly
propagated.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:28 -06:00
Ivan Kozlovic
bd1b7b8d55 Cleanup use of s.opts and fixed some lock (deadlock/inversion) issues
One should not access s.opts directly but instead use s.getOpts().
Also, server lock needs to be released when performing an account
lookup (since this may result in server lock being acquired).
A function was calling s.LookupAccount under the client lock, which
technically creates a lock inversion situation.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:28 -06:00
Ivan Kozlovic
105237cba8 [ADDED] Multiple routes and ability to have per-account routes
New configuration fields:
```
cluster {
   ...
   pool_size: 5
   accounts: ["A", "B"]
}
```

The configuration `pool_size` in the example above means that this
server will create 5 routes to a remote server, assuming that that
server has the same `pool_size` setting.

Accounts (which are not part of the `accounts[]` configuration)
are assigned a specific route in this pool, and this will be the
same route on all servers in the cluster.

Accounts that are defined in the `accounts` field will each have
a dedicated route connection. This will allow suppression of the
account name in some of the route protocols, reducing bytes transmitted
which may increase performance.
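
For pooled accounts, "the same route on all servers" implies a deterministic mapping from account name to pool index. One way to get that is to hash the name, as in the sketch below. The use of FNV-1a and the function name `routePoolIndex` are assumptions made for illustration; accounts listed in `accounts` would bypass this and use their dedicated route.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// routePoolIndex maps an account name to a route index in [0, poolSize).
// Every server computing this for the same name and pool size gets the
// same index, so both ends agree on which route carries the account.
// Assumption: FNV-1a is used here only as an example of a stable hash.
func routePoolIndex(account string, poolSize int) int {
	if poolSize <= 1 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(account)) // hash.Hash.Write never returns an error
	return int(h.Sum32() % uint32(poolSize))
}

func main() {
	for _, acc := range []string{"C", "D", "$G"} {
		fmt.Printf("account %q -> route %d of 5\n", acc, routePoolIndex(acc, 5))
	}
}
```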

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:25 -06:00
Neil Twigg
01a02f2382 Add logtime_utc option
2023-02-10 10:29:26 +00:00
Derek Collison
2daf90493b Authentication and Authorization callouts for server configuration mode.
This adds the ability to augment or override the NATS auth system.

A server will send a signed request to $SYS.REQ.USER.AUTH on the specified account. The request will contain client information, all client options sent to the server, and optionally TLS information and client certificates.
The external auth service will respond with an empty message if not authorized, or a signed User JWT that the user will bind to.

The response can change the account the client will be bound to.

Signed-off-by: Derek Collison <derek@nats.io>
2022-12-28 10:32:45 -08:00
Ivan Kozlovic
2d181e1c27 [FIXED] Routing: TLS connections to discovered server may fail
The server was not setting "server name" in the TLS configuration
for route connections, which may lead to failed (re)connect if
the certificate does not allow for the IP and the URL did not
have the hostname, which would happen with gossip protocol.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-07 17:26:17 -07:00
Ivan Kozlovic
b3e0431959 [FIXED] allow_non_tls is lost after server reload
The server would reset its INFO's TLSRequired to the presence
of a TLS configuration without checking for the allow_non_tls
option.
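
A minimal sketch of the corrected computation, assuming the advertised value should depend on both settings. The `listenerOpts` type and `tlsRequired` function are hypothetical and do not reproduce the server's code.

```go
package main

import "fmt"

// listenerOpts is a hypothetical subset of the server options.
type listenerOpts struct {
	TLSConfigured bool // a tls{} block is present
	AllowNonTLS   bool // allow_non_tls: true
}

// tlsRequired computes the value advertised in INFO. The bug described
// above amounts to returning o.TLSConfigured alone after a reload.
func tlsRequired(o listenerOpts) bool {
	return o.TLSConfigured && !o.AllowNonTLS
}

func main() {
	o := listenerOpts{TLSConfigured: true, AllowNonTLS: true}
	fmt.Println("TLS required:", tlsRequired(o))
}
```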

Resolves #3581

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-27 09:49:44 -06:00
Derek Collison
827b34a77a Add support for AES cipher encryption for filestore.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-15 14:21:37 -07:00
Matthias Hanel
aba1da090b [ADD] account specific in/out msgs/bytes stats to CONNS (#3187)
* [ADD] account specific in/out msgs/bytes stats to CONNS

This subject $SYS.ACCOUNT.%s.SERVER.CONNS will now respond with account
specific datastats for Received and sent messages as well as number of slow
consumers for the account.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-06-28 18:59:29 +02:00
Matthias Hanel
aabaf6f106 [fixed] reload related races (#3222)
account.rm had races caused by reload copying rm from one account to
another

mset.store was used outside the lock

in rare cases the statsz message was not received in time.
Trigger automatically now

sometimes a statsz message received before reload caused issues.
try receiving a second time

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-06-28 18:36:13 +02:00
Matthias Hanel
3421c49310 [Add] ability for operator to move streams (#3217)
Also added:
ability to reload tags
special tag (!jetstream) to remove peer from peer placement
$JS.API.SERVER.STREAM.MOVE subject to initiate move away from a server

This changes a detail about regular stream move as well.
Before, differing cluster names were used to start/stop a transfer.
Now only the peer list and its size relative to configured replica matter.
Once a transfer is considered completed, excess peers will be dropped
from the beginning of the list.
This allows transfers within the cluster as well.
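
The peer-list rule can be illustrated with a short sketch. It assumes that peers added for the move sit at the end of the list, so that dropping from the front removes the old ones; `dropExcessPeers` is a hypothetical name.

```go
package main

import "fmt"

// dropExcessPeers keeps the last `replicas` entries of the peer list,
// dropping any excess from the beginning.
// Assumption: newly added peers are appended at the end of the list.
func dropExcessPeers(peers []string, replicas int) []string {
	if replicas < 0 {
		replicas = 0
	}
	if len(peers) <= replicas {
		return peers
	}
	return peers[len(peers)-replicas:]
}

func main() {
	// Three old peers followed by three new ones while a move is in flight.
	peers := []string{"old1", "old2", "old3", "new1", "new2", "new3"}
	fmt.Println(dropExcessPeers(peers, 3))
}
```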

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-06-28 02:36:32 +02:00
Derek Collison
37f73ab229 Allow users directives for leafnodes to not block reloads.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-20 10:39:37 -07:00
Derek Collison
d24ae4723f Support reload
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-15 07:58:09 -07:00
Ivan Kozlovic
da256ea15a Added consumer_memory_storage to make consumer memory based
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-18 15:53:23 -06:00
Ivan Kozlovic
1ddc5bd9f6 Added consumer_replicas (similar to stream_replicas)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-18 15:53:23 -06:00
Ivan Kozlovic
5d3b1743e3 [ADDED] MQTT: Stream/Consumer replica count override
Ability to override the stream and consumers replica count, which is by default
determined based on the cluster size.

```
mqtt {
  port: 1883
  stream_replicas: 5
  consumer_replicas: 1
}
```

The above would allow *new* MQTT streams to be created with a replicas
factor of 5 (it will be an error if the cluster does not have that
many nodes, and error will occur at runtime when the first client
on a given account connects), and new consumers would be R=1.

The MQTT existing streams/consumers for an account are not modified.

The stream_replicas can also obviously be reduced to 1 for a cluster
of 3 nodes if one desires to have those streams as R=1.

A value of 0 or negative is considered letting the server pick
the value (from 1 to 3 depending on standalone/cluster size).
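
The rules in the paragraphs above could be sketched as follows. The function name `effectiveReplicas` is hypothetical, and reading "from 1 to 3 depending on standalone/cluster size" as "cluster size capped at 3" is an assumption.

```go
package main

import "fmt"

// effectiveReplicas resolves the replica count for a new MQTT stream
// or consumer. configured <= 0 lets the server pick.
func effectiveReplicas(configured, clusterSize int) (int, error) {
	if clusterSize < 1 {
		clusterSize = 1 // standalone server
	}
	if configured <= 0 {
		// Assumption: the server picks the cluster size, capped at 3.
		if clusterSize > 3 {
			return 3, nil
		}
		return clusterSize, nil
	}
	if configured > clusterSize {
		return 0, fmt.Errorf("replicas %d exceeds cluster size %d", configured, clusterSize)
	}
	return configured, nil
}

func main() {
	fmt.Println(effectiveReplicas(0, 5)) // server picks
	fmt.Println(effectiveReplicas(1, 3)) // explicit override
	fmt.Println(effectiveReplicas(5, 3)) // error surfaced at runtime
}
```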

There is another property that allows the consumers to be created
with memory storage instead of file:
```
mqtt {
  ..
  consumer_memory_storage: true
}
```

Those new settings are global and apply to new streams/consumers
only.

Related to #3116

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>

Update warning

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-18 15:50:23 -06:00
Ivan Kozlovic
730d8921e4 [FIXED] LeafNode: propagation interest issue after a config reload
When a configuration reload is done, the account's leaf node connections
were not transferred to the new instance of the account, causing the
interest to not be propagated until a leafnode reconnect or a server
restart.

Resolves #3009

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-20 08:03:34 -06:00
Derek Collison
5182154cd2 We were not accounting for some newer internal clients (JETSTREAM, ACCOUNT, etc) when reloading authorization, etc.
We were also not copying over local state that has been added over the years to track different types of clients.
We also needed to make sure to reuse the account's internal client and the subscription id (acc.isid).

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-30 19:12:18 -07:00
Matthias Hanel
3933c1f3d8 Fixes #2969, on reload stream import was not removed for js streams
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-30 18:12:57 -04:00
Matthias Hanel
1aeaaf0ca3 Adding server limits (max ack pending/dedupe window) to js config (#2967)
* Adding server limits (max ack pending/dedupe window) to js config

Also shifting consumer config check to jsConsumerCreate as in clustered
mode this was enforced in the wrong place

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-29 13:19:36 -04:00
Matthias Hanel
3e8b66286d Js leaf deny (#2693)
Along a leaf node connection, unless the system account is shared AND the JetStream domain name is identical, the default JetStream traffic (without a domain set) will be denied.

As a consequence, clients that want to access a domain other than the one in the server they are connected to must specify a domain name.
This change affects setups where a leaf node had no local JetStream OR the server the leaf node connected to had no local JetStream.
One of the two accounts that are connected via a leaf node remote must have no JetStream enabled.
The side that does not have JetStream enabled will lose JetStream access and its clients must set `nats.Domain` manually.

For workarounds on how to restore the old behavior, look at:
https://github.com/nats-io/nats-server/pull/2693#issuecomment-996212582

New config values added:
`default_js_domain` is a mapping from account to domain, settable when JetStream is not enabled in an account.
`extension_hint` are hints for non clustered server to start in clustered mode (and be usable to extend)
`js_domain` is a way to set the JetStream domain to use for mqtt.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-16 16:53:20 -05:00
Derek Collison
b96df068cb Add in max_sub_tokens support
Signed-off-by: Derek Collison <derek@nats.io>
2021-11-04 14:26:01 -07:00
Matthias Hanel
9911b37b0c [added] value to JS stats showing memory used from accounts with reservations
[fixed] reservations accounting issue on reload introduced by:
commit: bfb726e8e9
clearResources appeared to have been a workaround and broke
reload for non global accounts

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-09-21 16:35:24 -04:00
Matthias Hanel
41a253dabb fix daisy chained leaf node subject propagation issue. (#2468)
fixes #2448 

initLeafNodeSmapAndSendSubs did not pick up enough local subscriptions.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-08-25 18:10:09 -04:00
Derek Collison
925a6fe6b2 Fix for #2388. Leafnodes with no JS can seamlessly access a HUB with JS.
This is the reverse of the early work to have LNs extend a non-JS cluster.
Also have mixed mode tests as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-01 14:57:47 -07:00
0e04effaed Adds public ReloadOptions api support
Refactor Reload to call ReloadOptions
2021-07-07 09:57:35 -07:00
Waldemar Quevedo
60499e2749 ocsp: add more config options to customize ocsp stapling
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2021-06-10 10:48:51 -07:00
Ivan Kozlovic
c45f4f0353 [FIXED] LeafNode config reload failed without any change made
Issuing a configuration reload for a leafnode whose remotes
have more than 1 url could lead to a failure.
This is because we have introduced shuffling of remote urls but
that was done in the server's options object, which then would
cause the DeepEqual when diff'ing options to fail.
We move the shuffling to the private list of urls.
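
The effect can be reproduced in a few lines: shuffling a private copy leaves the options object untouched, so an unchanged configuration still compares equal. The `remoteOpts` type and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"reflect"
)

// remoteOpts is a hypothetical stand-in for a remote's options.
type remoteOpts struct{ URLs []string }

// shuffledCopy returns a shuffled private copy, leaving urls untouched.
func shuffledCopy(urls []string) []string {
	cp := append([]string(nil), urls...)
	rand.Shuffle(len(cp), func(i, j int) { cp[i], cp[j] = cp[j], cp[i] })
	return cp
}

// sameOptions mirrors the reload diff, which relies on reflect.DeepEqual.
func sameOptions(a, b remoteOpts) bool { return reflect.DeepEqual(a, b) }

func main() {
	loaded := remoteOpts{URLs: []string{"nats://a:7422", "nats://b:7422", "nats://c:7422"}}
	reloaded := remoteOpts{URLs: []string{"nats://a:7422", "nats://b:7422", "nats://c:7422"}}

	private := shuffledCopy(loaded.URLs) // used for connecting, never diffed
	fmt.Println(len(private), sameOptions(loaded, reloaded))
}
```

Had the shuffle been applied to `loaded.URLs` in place, the comparison would fail whenever the order changed, even though the configuration file did not.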

The other issue was that the "old" remote option may not have
had a local account and it was not set to "$G", which could make
the DeepEqual fail.

Resolves #2273

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-06-09 16:12:39 -06:00
Jaime Piña
6c992199ae ocsp: Add OCSP Stapling support for cluster, gateway and leafnodes
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
Signed-off-by: Jaime Piña <jaime@synadia.com>
2021-06-08 16:53:42 -07:00
Waldemar Quevedo
f89d06190c Merge pull request #2240 from nats-io/ocsp-caching
OCSP Stapling
2021-05-26 15:21:14 -07:00
Waldemar Quevedo
d78a91836b ocsp: Add caching staples to disk to store dir
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2021-05-26 15:04:05 -07:00
Matthias Hanel
b1dee292e6 [changed] pinned certs to check the server connected to as well (#2247)
* [changed] pinned certs to check the server connected to as well

on reload clients with removed pinned certs will be disconnected.
The check happens only on tls handshake now.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-05-24 17:28:32 -04:00
Jaime Piña
b2e1ff7a7c Add OCSP support
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2021-05-24 10:52:27 -07:00
Matthias Hanel
6f6f22e9a7 [added] pinned_cert option to tls block hex(sha256(spki)) (#2233)
* [added] pinned_cert option to tls block hex(sha256(spki))

When read from config, the values are automatically lower cased.
When the values are set programmatically, the check requires
lower case to avoid having to alter the map at this point.
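
A minimal sketch of the hex(sha256(spki)) computation and a lower-cased lookup, using only the Go standard library. The `pinSet` type and helper names are hypothetical, and the byte string in `main` is a placeholder rather than a real key.

```go
package main

import (
	"crypto/sha256"
	"crypto/x509"
	"encoding/hex"
	"fmt"
	"strings"
)

// pinFromSPKI returns hex(sha256(spki)); EncodeToString emits lower case.
func pinFromSPKI(spki []byte) string {
	sum := sha256.Sum256(spki)
	return hex.EncodeToString(sum[:])
}

// pinSet is a hypothetical container; keys are stored lower-cased,
// mirroring what the commit describes for values read from config.
type pinSet map[string]struct{}

func newPinSet(pins ...string) pinSet {
	ps := make(pinSet, len(pins))
	for _, p := range pins {
		ps[strings.ToLower(p)] = struct{}{}
	}
	return ps
}

// matches reports whether the certificate's public key is pinned.
func (ps pinSet) matches(cert *x509.Certificate) bool {
	_, ok := ps[pinFromSPKI(cert.RawSubjectPublicKeyInfo)]
	return ok
}

func main() {
	// Placeholder bytes stand in for a DER-encoded SubjectPublicKeyInfo.
	cert := &x509.Certificate{RawSubjectPublicKeyInfo: []byte("abc")}
	ps := newPinSet(strings.ToUpper(pinFromSPKI([]byte("abc"))))
	fmt.Println(ps.matches(cert))
}
```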

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-05-20 17:00:09 -04:00
Matthias Hanel
2664e964a8 [fixed] issue with concurrent account fetch when account was incomplete (#2067)
* [fixed] issue with concurrent account fetch when account was incomplete

This happened when a dummy (expired/incomplete) account was created during
a route operation. The dummy was to avoid fetching the account, which would
cause a lock inversion.
When a non route request required the account, we'd download it as it is
set to expired.
A concurrent request would result in ErrAccountResolverSameClaims which
the caller did not handle.
Fix is to remove ErrAccountResolverSameClaims.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-06 12:43:10 -04:00