4265 Commits

Author SHA1 Message Date
Derek Collison
91edd1a8d0 With snapshots both streams are present on restart so sources or mirrors that have a subject change from the origin would not recover.
We now suppress that if we know we are recovering an existing stream.

Signed-off-by: Derek Collison <derek@nats.io>
2022-09-29 09:17:15 -06:00
Derek Collison
52b5cd12bb Allow meta layer to snapshot on a clean shutdown.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-29 09:17:12 -06:00
Derek Collison
fef702a688 [FIXED] bug in consumer names paging, did not honor limits and returned duplicate results.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-29 06:14:00 -07:00
Ivan Kozlovic
8d9c57ad44 [IMPROVED] Fan-out performance
There was an observed degradation (around 5%) for large fan-out in
v2.9.0 compared to earlier releases. This is because we added
accounting of the in/out messages for the account, which results
in 4 atomic operations, 2 for in and 2 for out. However, it means
that for a fan-out of, say, 100 matching subscriptions, it is now
2 + 2 * 100 = 202.

This PR reworks how the stats accounting is done, which removes
the regression and even boosts the numbers a bit, since we are
doing the server stats update as an aggregate too.

There is still degradation for queues and for the no-subscription
case that needs to be looked at.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-27 19:43:32 -06:00
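The arithmetic in 8d9c57ad44 can be illustrated with a small Go sketch. The `stats` type and both functions are hypothetical and are not the server's code; they only contrast updating the counters once per delivery with accumulating locally and updating once after the fan-out loop.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// stats is a hypothetical stand-in for per-account message counters.
type stats struct {
	inMsgs, inBytes   int64
	outMsgs, outBytes int64
}

// perDelivery updates the out counters for every matching subscription
// and returns how many atomic operations were performed.
func perDelivery(s *stats, subs int, size int64) int {
	ops := 0
	atomic.AddInt64(&s.inMsgs, 1)
	atomic.AddInt64(&s.inBytes, size)
	ops += 2
	for i := 0; i < subs; i++ {
		atomic.AddInt64(&s.outMsgs, 1)
		atomic.AddInt64(&s.outBytes, size)
		ops += 2
	}
	return ops
}

// aggregated accumulates the out totals in local variables and
// publishes them with a fixed number of atomic operations.
func aggregated(s *stats, subs int, size int64) int {
	var msgs, total int64
	for i := 0; i < subs; i++ {
		msgs++
		total += size
	}
	atomic.AddInt64(&s.inMsgs, 1)
	atomic.AddInt64(&s.inBytes, size)
	atomic.AddInt64(&s.outMsgs, msgs)
	atomic.AddInt64(&s.outBytes, total)
	return 4
}

func main() {
	var a, b stats
	fmt.Println("per delivery:", perDelivery(&a, 100, 128)) // 2 + 2*100
	fmt.Println("aggregated:  ", aggregated(&b, 100, 128))
}
```

Both variants end with the same counter values; only the number of atomic operations on the hot path differs.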
Ivan Kozlovic
8247ecbf20 Merge pull request #3502 from nats-io/fix_3493
[FIXED] JetStream: Scale down of consumer to R1 would not get a response
2022-09-27 15:33:38 -06:00
Ivan Kozlovic
e151cfcd57 [FIXED] JetStream: Scale down of consumer to R1 would not get a response
Updating a consumer configuration from say R3 to R1 would work
but no response was received by the client sending the request.

Resolves #3493

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-27 10:02:31 -06:00
Derek Collison
2f1f6221bc Make sure to clear fss state on disk
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-26 19:53:19 -07:00
Ivan Kozlovic
c4bd813fab [FIXED] JetStream: File store memory usage
The write cache may be pinned for longer than needed when creating
a new write block. This could be seen in some benchmark tests.

The old block cache would be kept for 5 more seconds, which, with
a fast rate of inserts could start to show in some memory profiling.

This was a change introduced in https://github.com/nats-io/nats-server/pull/3351
which was different than code in v2.8.4

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-26 19:06:12 -06:00
Ivan Kozlovic
08968287d5 [FIXED] JetStream: prevent panic on consumer assignment
The stream could be stopped while the routine processing the consumer
assignment is running, which would lead to a panic.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-26 13:11:35 -06:00
Caleb Lloyd
3babdda3bb [FIXED] Format protocol error []byte with %q
Protocol errors print arguments that contain arbitrary []byte
and are possibly not formattable strings; use %q to escape

Signed-off-by: Caleb Lloyd <caleb@synadia.com>
2022-09-26 13:52:56 -04:00
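As an illustration of the formatting choice in 3babdda3bb, the following Go sketch shows how `%q` escapes arbitrary bytes. The `describe` helper and the sample payload are hypothetical; only the standard `fmt` verbs are relied upon.

```go
package main

import "fmt"

// describe renders a protocol argument for logging (hypothetical helper).
// %q quotes the bytes and escapes control and non-printable characters.
func describe(arg []byte) string {
	return fmt.Sprintf("%q", arg)
}

func main() {
	raw := []byte("PUB foo\x00\r\n")
	// %s writes the bytes as they are, control characters included.
	fmt.Printf("raw:    %s\n", raw)
	// The quoted form stays on one line and is safe to log.
	fmt.Printf("quoted: %s\n", describe(raw))
}
```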
Ivan Kozlovic
74c0b18fd2 Bump version to 2.9.2-beta as per release process
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-22 15:33:36 -06:00
Ivan Kozlovic
a9293979b0 Release v2.9.1
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-22 14:32:18 -06:00
Ivan Kozlovic
cfd4f7d5b3 [FIXED] LeafNode: connecting using websocket and no_auth_user
If the `no_auth_user` is set in the `websocket{}` block and a
server creates a leafnode connection using the websocket port,
and does not provide credentials, that no_auth_user should be
used, but was not.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-22 10:35:36 -06:00
Derek Collison
ac88662a89 Bump to 2.9.1-Beta.2
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 08:27:50 -07:00
Derek Collison
c65ef5c1b3 Merge pull request #3487 from nats-io/discard-new-per
Allow discard new per subject for certain KV type scenarios.
2022-09-22 08:26:10 -07:00
Derek Collison
9774ad5641 Added check on publish error.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 07:13:57 -07:00
Derek Collison
61a3cff274 Also require MaxMsgsPerSubject to be set per peer review feedback.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 06:56:32 -07:00
Derek Collison
2d737edba6 Allow discard new per subject for certain KV type scenarios. Requires general DiscardNewPolicy.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 06:38:29 -07:00
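The two requirements stated in 2d737edba6 and 61a3cff274 can be sketched as a validation step. This is a minimal sketch, not the server's code: the `streamConfig` type, its field names, the error texts, and the reading of "set" as "greater than zero" are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// DiscardPolicy is a hypothetical, simplified discard policy type.
type DiscardPolicy int

const (
	DiscardOld DiscardPolicy = iota
	DiscardNew
)

// streamConfig is a hypothetical, trimmed-down stream configuration.
type streamConfig struct {
	Discard       DiscardPolicy
	DiscardNewPer bool  // discard new per subject
	MaxMsgsPer    int64 // max messages per subject
}

// validate enforces the two preconditions named in the commit messages.
func validate(cfg streamConfig) error {
	if !cfg.DiscardNewPer {
		return nil
	}
	if cfg.Discard != DiscardNew {
		return errors.New("discard new per subject requires the discard new policy")
	}
	if cfg.MaxMsgsPer <= 0 { // assumption: "set" means a positive limit
		return errors.New("discard new per subject requires max msgs per subject")
	}
	return nil
}

func main() {
	kvLike := streamConfig{Discard: DiscardNew, DiscardNewPer: true, MaxMsgsPer: 1}
	fmt.Println(validate(kvLike))
	fmt.Println(validate(streamConfig{DiscardNewPer: true}))
}
```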
Derek Collison
7e1bc54389 Fix for #3848.
When a block's subject meta state was swapped out and subsequently loaded back in with only one subject present, but other messages with different subjects were added later, a filtered get could return the wrong result.

Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 04:57:05 -07:00
Ivan Kozlovic
52d7b481c4 Bump version to 2.9.1-beta.1
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-20 09:22:20 -06:00
Derek Collison
8b2315eadd When filtering a source stream use new consumer create API subject.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-16 14:45:33 -07:00
Ivan Kozlovic
170ff49837 [ADDED] JetStream: peer (the hash of server name) in statsz/jsz
A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something
like this:
```
...
    "meta_cluster": {
      "name": "local",
      "leader": "A",
      "peer": "NUmM6cRx",
      "replicas": [
        {
          "name": "B",
          "current": true,
          "active": 690369000,
          "peer": "b2oh2L6w"
        },
        {
          "name": "Server name unknown at this time (peerID: jZ6RvVRH)",
          "current": false,
          "offline": true,
          "active": 0,
          "peer": "jZ6RvVRH"
        }
      ],
      "cluster_size": 3
    }
```
Note the "peer" field following the "leader" field that contains
the server name. The new field is the node ID, which is a hash of
the server name.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-16 15:31:37 -06:00
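A client could use the new "peer" field from 170ff49837 to find replicas whose server name is not known. The Go sketch below decodes a trimmed copy of the sample output; the struct names and the top-level wrapper object are assumptions, and only the JSON keys shown in the commit message are used.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// replica and metaCluster mirror only the keys shown in the sample output.
type replica struct {
	Name    string `json:"name"`
	Offline bool   `json:"offline"`
	Peer    string `json:"peer"`
}

type metaCluster struct {
	Leader   string    `json:"leader"`
	Peer     string    `json:"peer"`
	Replicas []replica `json:"replicas"`
}

// jszFragment is a hypothetical wrapper around the "meta_cluster" object.
type jszFragment struct {
	Meta metaCluster `json:"meta_cluster"`
}

const sample = `{
  "meta_cluster": {
    "name": "local",
    "leader": "A",
    "peer": "NUmM6cRx",
    "replicas": [
      {"name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w"},
      {"name": "Server name unknown at this time (peerID: jZ6RvVRH)",
       "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH"}
    ],
    "cluster_size": 3
  }
}`

// offlinePeers returns the peer IDs of replicas marked offline.
func offlinePeers(data []byte) ([]string, error) {
	var f jszFragment
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, err
	}
	var ids []string
	for _, r := range f.Meta.Replicas {
		if r.Offline {
			ids = append(ids, r.Peer)
		}
	}
	return ids, nil
}

func main() {
	ids, err := offlinePeers([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(ids)
}
```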
Ivan Kozlovic
09cdec793d Merge pull request #3474 from nats-io/js_max_bytes_close_req
[FIXED] JetStream: Pull requests closed due to max_bytes were silent
2022-09-16 15:23:25 -06:00
Ivan Kozlovic
378fed164d [FIXED] JetStream: possible panic on peer remove on server shutdown
This was discovered by the new test TestJetStreamClusterRemovePeerByID.
I saw this on Travis and was able to reproduce it by repeating the
test locally with -count=10. The issue is that cc.meta can be nil
while cc.meta.ID() is accessed directly.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-16 15:06:58 -06:00
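The kind of guard described in 378fed164d can be sketched as follows. The `cluster`, `raftNode`, and `node` types and the `metaID` helper are hypothetical stand-ins; the actual fix in the server may look different.

```go
package main

import "fmt"

// raftNode is a hypothetical, minimal interface for the meta group node.
type raftNode interface {
	ID() string
}

type node struct{ id string }

func (n node) ID() string { return n.id }

// cluster stands in for the structure that holds the meta node,
// which may be nil once the server is shutting down.
type cluster struct {
	meta raftNode
}

// metaID returns the meta node ID, or false when there is no meta node.
// Calling cc.meta.ID() without this check would panic on a nil interface.
func (cc *cluster) metaID() (string, bool) {
	if cc == nil || cc.meta == nil {
		return "", false
	}
	return cc.meta.ID(), true
}

func main() {
	running := &cluster{meta: node{id: "NUmM6cRx"}}
	stopped := &cluster{}

	id, ok := running.metaID()
	fmt.Println(id, ok)
	id, ok = stopped.metaID()
	fmt.Println(id, ok)
}
```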
Ivan Kozlovic
ff0bda415b [FIXED] JetStream: Pull requests closed due to max_bytes were silent
If a client pull request has a max_bytes value and the server
cannot deliver a single message (because its size is too big), the
server sends a 409 to signal that to the client library. However,
if it delivered at least one message, it would close the request
without notifying the client with a 409, which would cause the
client library to have to wait for its expiration/timeout.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-15 16:55:41 -06:00
Ivan Kozlovic
dc2e4b714a Merge pull request #3473 from nats-io/js_raft_remove_by_peer_id
[ADDED] JetStream: ability to remove a server by peer ID instead of name
2022-09-15 13:52:20 -06:00
Ivan Kozlovic
3fadccab38 Move new test to new jetstream_cluster_3_test.go file
Since the second batch was already past the 5min mark and a bit
longer than the first batch, it is a good opportunity to add
this new test in a new file. Updated runTestsOnTravis and travis.yml
accordingly.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-15 12:13:00 -06:00
Ivan Kozlovic
f113163b9f Change ByID boolean to Peer string and add Peer id in replicas output
The CLI will now be able to display the peer IDs in MetaGroupInfo
if it chooses to do so, and possibly help the user select the peer ID
from a list with a new command to remove by peer ID instead of
by server name.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-15 10:39:23 -06:00
Ivan Kozlovic
e1f0361b98 [ADDED] JetStream: ability to remove a server by peer ID instead of name
This can be helpful after a partial cluster restart since in that
case the server name may not be known. However "server report jetstream"
would report the peer ID that then can be used.

For instance here is the output after a cluster restart where server "C"
is not restarted.

```
nats -s nats://sys:pwd@localhost:4222 server report jetstream
...
╭────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                  RAFT Meta Group Information                                   │
├─────────────────────────────────────────────────────┬────────┬─────────┬────────┬────────┬─────┤
│ Name                                                │ Leader │ Current │ Online │ Active │ Lag │
├─────────────────────────────────────────────────────┼────────┼─────────┼────────┼────────┼─────┤
│ A                                                   │ yes    │ true    │ true   │ 0.00s  │ 0   │
│ B                                                   │        │ true    │ true   │ 0.53s  │ 0   │
│ Server name unknown at this time (peerID: jZ6RvVRH) │        │ false   │ false  │ 0.00s  │ 0   │
╰─────────────────────────────────────────────────────┴────────┴─────────┴────────┴────────┴─────╯
```

With a change to the NATS CLI we could have something like:
```
nats -s nats://sys:pwd@localhost:4222 server raft peer-remove jZ6RvVRH --by_id
```

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-14 18:10:26 -06:00
Derek Collison
2aaf22b0de For ackMsg, make sure sequence is still relevant as well.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-14 16:47:35 -07:00
Derek Collison
6c97733bb8 Optimize needAck.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-14 16:25:50 -07:00
Ivan Kozlovic
c8ea439e21 Merge pull request #3471 from nats-io/jarema/durable-and-name-mismatch
Add error if Consumer Name and Durable are not equal
2022-09-14 13:26:19 -06:00
Tomasz Pietrek
dbf7636e15 Add error if Consumer Durable and Name are not equal
This error will happen only if both Name and Durable are specified.

Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2022-09-14 20:31:18 +02:00
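The rule from dbf7636e15 can be sketched in a few lines of Go. The `consumerConfig` type, the `checkNames` function, and the error text are hypothetical; only the condition itself (both fields specified and not equal) comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// consumerConfig is a hypothetical, trimmed-down consumer configuration.
type consumerConfig struct {
	Name    string
	Durable string
}

// checkNames rejects a config only when both Name and Durable are
// specified and differ; either field alone is accepted.
func checkNames(cfg consumerConfig) error {
	if cfg.Name != "" && cfg.Durable != "" && cfg.Name != cfg.Durable {
		return errors.New("consumer name and durable name do not match")
	}
	return nil
}

func main() {
	fmt.Println(checkNames(consumerConfig{Name: "orders", Durable: "orders"}))
	fmt.Println(checkNames(consumerConfig{Name: "orders", Durable: "billing"}))
	fmt.Println(checkNames(consumerConfig{Durable: "billing"}))
}
```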
Deepak
e9ce118c56 Fix peer randomisation when creating consumers groups for replica=1
Signed-off-by: Deepak <sah.sslpu@gmail.com>
2022-09-14 13:58:49 +05:30
Ivan Kozlovic
a41af2bdcb Merge pull request #3463 from nats-io/jnm/fix_mapping_split
[FIXED] Edge condition handling in {{Split()}} subject mapping function
2022-09-09 14:39:19 -06:00
jnmoyne
a1f90b8776 Fixes mishandling of an edge condition in the {{Split()}} subject mapping function
2022-09-09 12:42:03 -07:00
Ivan Kozlovic
29224c8ea9 Split more tests to speed up Travis run
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-09 12:45:48 -06:00
Ivan Kozlovic
925f57ccc2 Bump version to 2.9.1-beta
As per the release process, bumping the version to next update
with beta suffix once the release is out.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-09 10:17:26 -06:00
Ivan Kozlovic
b979556556 Release v2.9.0
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-09 08:23:48 -06:00
Matthias Hanel
f7cb5b1f0d changed format of JSClusterNoPeers error (#3459)
* changed format of JSClusterNoPeers error

This error was introduced in #3342 and reveals too much information.
This change gets rid of cluster names and peer counts.

All other counts were changed to booleans,
which are only included in the output when the filter was hit.

In addition, the set of non-matching tags is included.
Furthermore, the static error description in server/errors.json
is moved into selectPeerError.

Sample errors:
1) no suitable peers for placement, tags not matched ['cloud:GCP', 'country:US']
2) no suitable peers for placement, insufficient storage

Signed-off-by: Matthias Hanel <mh@synadia.com>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Co-authored-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-08 18:25:48 -07:00
Derek Collison
fdf52554c7 Bump to 2.9.0-RC.19
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-08 12:08:48 -07:00
Derek Collison
d979937bbd Merge pull request #3456 from nats-io/max-bytes-pull
[IMPROVED] Pull request logic
2022-09-08 12:08:10 -07:00
Derek Collison
dedf21d45d Fix for issue #3455
When hitting max ack pending, getNextMsg would incorrectly remove one-shots.

Signed-off-by: Derek Collison <derek@nats.io>
2022-09-08 11:56:57 -07:00
Derek Collison
b32814d5fd Better accounting for max-bytes for pull consumers
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-08 11:56:57 -07:00
Ivan Kozlovic
ae0d808f5b Merge pull request #3457 from nats-io/cleanup_tests
Fixed some tests
2022-09-08 12:24:07 -06:00
jnmoyne
95c1946231 Implements pagination for JS Stream Info requests
2022-09-08 10:45:20 -07:00
Ivan Kozlovic
b69ffe244e Fixed some tests
Code change:
- Do not start the processMirrorMsgs and processSourceMsgs goroutines
if the server has been detected to be shut down. This would otherwise
leave some goroutines running at the end of some tests.
- Pass the fch and qch to the consumerFileStore's flushLoop; otherwise
in some tests this routine could be left running.

Test changes:
- Added missing defer NATS connection close
- Added missing defer server shutdown

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-08 11:28:23 -06:00
Matthias Schneider
a58a7bf1ec server: expire display Never instead of 1970
2022-09-08 10:03:04 +02:00
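One plausible reading of a58a7bf1ec is that an unset expiration, stored as a zero Unix timestamp, was being formatted as a date in 1970. The sketch below shows that idea only; `formatExpiry`, the seconds-based timestamp, and the output layout are assumptions and not the server's code.

```go
package main

import (
	"fmt"
	"time"
)

// formatExpiry renders an expiration given as Unix seconds.
// A zero value is treated as "no expiration" instead of the epoch.
func formatExpiry(unixSecs int64) string {
	if unixSecs == 0 {
		return "Never"
	}
	return time.Unix(unixSecs, 0).UTC().Format(time.RFC3339)
}

func main() {
	// Without the zero check, this is what an unset value would display.
	fmt.Println(time.Unix(0, 0).UTC().Format(time.RFC3339))
	fmt.Println(formatExpiry(0))
	fmt.Println(formatExpiry(1662624184))
}
```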
Derek Collison
a7744026e1 Bump to 2.9.0-RC.18
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-07 19:35:56 -07:00
Ivan Kozlovic
8c1c6951dc JetStream: R1 durables were incorrectly migrated on shutdown
This could happen for a stream with R>1 but with a durable that
has an override of R=1.

Fixed a test to make sure assets have an elected leader.

Also fixed a gateway test that would cause a data race.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-07 19:49:16 -06:00