Commit Graph

545 Commits

Author SHA1 Message Date
Derek Collison
e54019f87f All should be lowercase
Signed-off-by: Derek Collison <derek@nats.io>
2023-04-02 03:53:01 -07:00
Neil Twigg
68961ffedd Refactor ipQueue to use generics, reduce allocations 2023-02-21 14:50:09 +00:00
Neil Twigg
14d0ba1c65 Fix some lint errors after move to golangci-lint 2022-12-30 20:00:08 +00:00
Derek Collison
a5814dad1f In operator mode do not set a no_auth_user for $G.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-25 10:20:35 -08:00
Derek Collison
06bab2c4de If no_auth_user is set, clear auth required for server info.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-21 20:26:54 -08:00
Ivan Kozlovic
3ec42d5b85 Updates to PR #3611
- Save the TLS name only if not already set
- Use the passed URLs slice instead of using s.getOpts().Routes
- Enhanced the test
- Fixed an unrelated DATA RACE report

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-08 09:36:08 -07:00
Ivan Kozlovic
2d181e1c27 [FIXED] Routing: TLS connections to discovered server may fail
The server was not setting "server name" in the TLS configuration
for route connections, which may lead to failed (re)connect if
the certificate does not allow for the IP and the URL did not
have the hostname, which would happen with gossip protocol.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-07 17:26:17 -07:00
Derek Collison
15dc72db50 Removed migration of ephemerals, added proper signaling for pul consumers pending requests.
Signed-off-by: Derek Collison <derek@nats.io>
2022-10-25 14:35:20 -07:00
Ivan Kozlovic
bec51ed52b [FIXED] JetStream: User given named ephemeral lost after migration
If an ephemeral was given a name by the user, if the consumer leader
was then shutdown, the ephemeral would be migrated using a server
generated new name instead of keeping the user given name.

Also, in some cases the migration would not even occur. This was
likely due to the fact that RAFT node(s) were shutdown prior to
the ephemeral migration code was invoked.

Resolves #3550

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-14 15:20:45 -06:00
Ivan Kozlovic
170ff49837 [ADDED] JetStream: peer (the hash of server name) in statsz/jsz
A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something
like this:
```
...
    "meta_cluster": {
      "name": "local",
      "leader": "A",
      "peer": "NUmM6cRx",
      "replicas": [
        {
          "name": "B",
          "current": true,
          "active": 690369000,
          "peer": "b2oh2L6w"
        },
        {
          "name": "Server name unknown at this time (peerID: jZ6RvVRH)",
          "current": false,
          "offline": true,
          "active": 0,
          "peer": "jZ6RvVRH"
        }
      ],
      "cluster_size": 3
    }
```
Note the "peer" field following the "leader" field that contains
the server name. The new field is the node ID, which is a hash of
the server name.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-16 15:31:37 -06:00
Matthias Schneider
a58a7bf1ec server: expire display Never instead of 1970 2022-09-08 10:03:04 +02:00
Derek Collison
56e177c329 Allow stream msgs to be compressed within the raft layer and during upper layer catchups.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-30 16:10:57 -07:00
Ivan Kozlovic
9a6a2c31ee [ADDED] JetStream: Ability to configure the per server max catchup bytes
The original value was hardcoded to 128MB and 32MB per stream. The
per-server limit is lowered to 32MB but is configurable with
a new configuration parameter:
```
jetstream {
   max_catchup: 8MB
}
```

The per-stream limit was also lowered from 32MB/128,000msgs to
8MB/32,000 messages.

Tests have shown no difference in performance for fast links.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-30 13:46:13 -06:00
Ivan Kozlovic
284e35132b Merge pull request #3387 from nats-io/fix_3317
[ADDED] Monitoring: TLS Peer Certificates in Connz when auth is on
2022-08-24 14:28:01 -06:00
Ivan Kozlovic
951b7c38f6 [ADDED] Monitoring: TLS Peer Certificates in Connz when auth is on
Add basic peer certificates information in /connz endpoint when
the "auth" option is provided.

Resolves #3317

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-22 11:48:49 -06:00
Ivan Kozlovic
f6c4e5fcee [CHANGED] Gateway: Switch all accounts to interest-only mode
We are phasing out the optimistic-only mode. Servers accepting
inbound gateway connections will switch the accounts to interest-only
mode.

The servers with outbound gateway connection will check interest
and ignore the "optimistic" mode if it is known that the corresponding
inbound is going to switch the account to interest-only. This is
done using a boolean in the gateway INFO protocol.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-19 16:41:44 -06:00
Ivan Kozlovic
7de4497815 Install consumer snapshot on clean exit and few other fixes
- didRemove in applyMetaEntries() could be reset when processing
multiple entries
- change "no race" test names to include JetStream
- separate raft nodes leader stepdown and stop in server
shutdown process
- in InstallSnapshot, call wal.Compact() with lastIndex+1

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-16 17:05:49 -06:00
Ivan Kozlovic
3c9a7cc6e5 Move to Go 1.19, remote io/util, fix data race and a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 09:55:37 -06:00
Ivan Kozlovic
5bc03c7637 Update leafNodeEnabled value on Start()
Maybe that is the place it could be set and not in NewServer(), but
want to minimize risk of breaking something close to 2.9.0

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-04 14:13:35 -06:00
Ivan Kozlovic
d84d9f8288 Use specific boolean for a leaf test instead of using leafNodeEnabled
A test TestJetStreamClusterLeafNodeSPOFMigrateLeaders was added at
some point that needed the remotes to stop (re)connecting. It made
use of existing leafNodeEnabled that was used for GW/Leaf interest
propagation races to disable the reconnect, but that may not be
the best approach since it could affect users embedding servers
and adding leafnodes "dynamically".

So this PR introduced a specific boolean specific for that test.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-04 10:00:11 -06:00
Matthias Hanel
ed2cb280cc Move to crypto/rand for nonce generation (#3324)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-08-03 06:52:26 +02:00
Matthias Hanel
d53d2d0484 [Added] account specific monitoring endpoint(s) (#3250)
Added http monitoring endpoint /accstatz
It responds with a list of statz for all accounts with local connections
the argument "unused=1" can be provided to get statz for all accounts
This endpoint is also exposed as nats request under:

This monitoring endpoint is exposed via the system account.
$SYS.REQ.ACCOUNT.*.STATZ
Each server will respond with connection statistics for the requested
account. The format of the data section is a list (size 1) identical to the event
$SYS.ACCOUNT.%s.SERVER.CONNS which is sent periodically as well as on
connect/disconnect. Unless requested by options, server without the account,
or server where the account has no local connections, will not respond.

A PING endpoint exists as well. The response format is identical to
$SYS.REQ.ACCOUNT.*.STATZ
(however the data section will contain more than one account, if they exist)
In addition to general filter options the request takes a list of accounts and
an argument to include accounts without local connections (disabled by default)
$SYS.REQ.ACCOUNT.PING.STATZ

Each account has a new system account import where the local subject
$SYS.REQ.ACCOUNT.PING.STATZ essentially responds as if
the importing account name was used for $SYS.REQ.ACCOUNT.*.STATZ

The only difference between requesting ACCOUNT.PING.STATZ from within
the system account and an account is that the later can only retrieve
statz for the account the client requests from.

Also exposed the monitoring /healthz via the system account under
$SYS.REQ.SERVER.*.HEALTHZ
$SYS.REQ.SERVER.PING.HEALTHZ
No dedicated options are available for these.
HEALTHZ also accept general filter options.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-07-12 21:50:32 +02:00
Ivan Kozlovic
5ed212ce04 Rework startupComplete gate from PR #2360
The "InProcess" change make readyForConnections() possibly return
much faster than it used to, which could cause tests to fail.

Restore the original behavior, but in case of DontListen option
wait on the startupComplete gate.

Also fixed some missing checks for leafnode connections.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-06-28 17:36:39 -06:00
Neil Alexander
558293e096 Fix the lock 2022-06-28 18:05:57 +01:00
Neil Alexander
cedf08a1b7 Commit properly 2022-06-28 17:03:42 +01:00
Neil Alexander
ff696f00d8 Remainder of time after waiting for startupComplete, add waitgroup done call after createClient 2022-06-28 17:03:42 +01:00
Neil Alexander
5b04c49df9 Re-add startupComplete channel, make readyForConnections wait for it 2022-06-28 17:03:42 +01:00
Neil Alexander
9190feb05f Review comments @kozlovic 2022-06-28 17:03:40 +01:00
Neil Alexander
90d7e007c0 Update comments (re. review) 2022-06-28 17:02:47 +01:00
Neil Alexander
e9abc5801e Add InProcessConn, DontListen 2022-06-28 17:02:47 +01:00
Derek Collison
92cd7821de Convert server mutex to RW.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-27 16:05:03 -07:00
Derek Collison
cc197771ec Allow compile and staticheck to pass.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-24 09:17:12 -07:00
Ivan Kozlovic
4bf81420e2 [FIXED] Fast routed JetStream API requests were dropped
If a JS API request is received from a non client connection, it
was processed in its own go routine. To reduce the number of
such go routine, we were limiting the number of outstanding routines
to 4096. However, in some situations, it was possible to issue
many requests at the same time that would then cause those requests
to be dropped.

(an example was an MQTT benchmark tool that would create 5000
sessions, each with one QoS1 R1 consumer (with the use of consumer_replicas=1).
On abrupt exit of the tool, the consumers and their sessions needed
to be deleted. Since would cause fast incoming delete consumer requests
which would cause the original code to drop some of them)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-23 11:15:55 -06:00
Ivan Kozlovic
3cdbba16cb Revert "[added] support for jwt operator option DisallowBearerToken" 2022-05-04 11:11:25 -06:00
Matthias Hanel
c9217bad33 review comments
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-29 20:00:37 -04:00
Matthias Hanel
bd2883122e [added] support for jwt operator option DisallowBearerToken
I modified an existing data structure that held a similar attribute already.
Instead this data structure references the claim.

change 3 out of 3. Fixes #3084
corresponds to:
https://github.com/nats-io/jwt/pull/177
https://github.com/nats-io/nsc/pull/495

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-29 14:18:11 -04:00
Matthias Hanel
d520a27c36 [fixed] step down timing, consumer stream seqno, clear redelivery (#3079)
Step down timing for consumers or streams.
Signals loss of leadership and sleeps before stepping down.
This makes it less likely that messages are being processed during step
down.

When becoming leader, consumer stream seqno got reset,
even though the consumer existed already.

Proper cleanup of redelivery data structures and timer

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-27 03:32:08 -04:00
Ivan Kozlovic
50c3986863 [FIXED] JetStream stream catchup issues
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-12 16:05:12 -06:00
Matthias Hanel
13e5ab10bd fix js nex interest check where leaf node masked gw subj propagation (#3016)
basically a gw subject propagation issue could be hidden behind a leaf
node.
also change error text when this was the case

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-11 14:04:09 -04:00
Ivan Kozlovic
366d217f44 Some changes based on review
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-01 17:55:33 -06:00
Ivan Kozlovic
19783a9f11 [CHANGED] Rate limit similar warnings
Some warnings, especially when dealing with JS limits that were
printed on a per-message basis, are now limited to ~1 per second
if the content of the warning is already found in a map.

This is also for "client" warnings, but the client porting of the
warning is not taken into account so that helps with reducing logging
for similar content, but coming from different clients.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-01 15:24:03 -06:00
Ivan Kozlovic
7bb7309f4c [FIXED] Monitoring: verify_and_map in tls{} config would break monitoring
This was introduced in v2.6.6. In order to solve a config reload
issue, we used tls.Config.GetConfigForClient which allowed the
TLS configuration to be "refreshed" with the latest. However, in
this case, the tls.Config.ClientAuth was not reset to tls.NoClientCert
which we need for monitoring port.

Resolves #2980

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-30 18:50:52 -06:00
Matthias Hanel
0c5f3688a7 [ADDED] Tiered limits and fix limit issues on updates (#2945)
* Adding tiered limits and fix limit issues on updates

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-28 20:47:54 -04:00
Ivan Kozlovic
91bdcc30cc [FIXED] Server version check
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-25 12:11:55 -06:00
Derek Collison
edcddfae58 Make at least work
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-24 19:12:31 -07:00
Derek Collison
1d38a73bcb Fix for version comparison
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-24 18:39:28 -07:00
Ivan Kozlovic
a23b1b73ef Merge pull request #2931 from nats-io/ipq_changes
Changes to IPQueues
2022-03-17 19:13:02 -06:00
Ivan Kozlovic
c3da392832 Changes to IPQueues
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have beend pop()'ed. When recycle() is invoked this count
goes down.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-17 17:53:06 -06:00
Derek Collison
fa098f1af0 Show version on main monitoring page with link to source
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-17 11:04:11 -07:00
Ivan Kozlovic
63c750e295 [CHANGED] Gateway: Detect duplicate names between clusters
Gateway connection will be closed and error reported if a remote
has a name that is a duplicate of the local cluster.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-15 15:00:13 -06:00