146 Commits

Author SHA1 Message Date
Ivan Kozlovic
1eb08505d4 [FIXED] Routes: Pinned Accounts connect/reconnect in some cases
The issue is with a server that has a route for a given account
but connects to a server that does not support it. The creation
of the route for this account will fail - as expected - and the
server will stop trying to create the route for this account.
But it needs to retry to create this route if it were to reconnect
to that same URL in case the server (or its config) is updated
to support a route for this account.

There was also an issue even with 2.10.0 servers in some gossip
situations. Namely, if server B is soliciting connections to A
(but not vice-versa) and A is soliciting connections to C (but
not vice-versa), connections for pinned accounts would not be
created.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-09-28 10:46:32 -06:00
Ivan Kozlovic
a84ce61a93 [FIXED] Account resolver lock inversion
There was a lock inversion, but it was low risk since it happened
during server initialization. Still fixed it and added the ordering
to the locksordering.txt file.

Also fixed multiple lock inversions that were caused by tests.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-09-25 15:09:11 -06:00
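
Note on a84ce61a93: a lock inversion occurs when two code paths take the same pair of locks in opposite orders. The sketch below illustrates the general remedy of a single documented acquisition order. It is not the server's code: the `ledger` type, its `id` field, and the `move` and `runOpposing` helpers are hypothetical names chosen for illustration, and ordering by `id` is an assumption standing in for whatever order locksordering.txt records.

```go
package main

import "sync"

// ledger is a hypothetical lock-protected value. The id field exists only
// to define a global lock order for this sketch.
type ledger struct {
	id  int
	mu  sync.Mutex
	val int
}

// move transfers n from a to b. Both locks are always taken in ascending
// id order, so move(a, b) and move(b, a) can never hold one lock each
// while waiting for the other.
func move(a, b *ledger, n int) {
	if a == b {
		return // same ledger: nothing to transfer, and locking twice would deadlock
	}
	first, second := a, b
	if b.id < a.id {
		first, second = b, a
	}
	first.mu.Lock()
	defer first.mu.Unlock()
	second.mu.Lock()
	defer second.mu.Unlock()

	a.val -= n
	b.val += n
}

// runOpposing runs transfers in both directions concurrently and returns
// the final values once both goroutines have finished.
func runOpposing(rounds int) (int, int) {
	a := &ledger{id: 1, val: 100}
	b := &ledger{id: 2, val: 100}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			move(a, b, 1)
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			move(b, a, 1)
		}
	}()
	wg.Wait()
	return a.val, b.val
}
```

If `move` instead locked `a` before `b` unconditionally, the two goroutines in `runOpposing` could each hold one lock and block forever on the other.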
Derek Collison
f95ef63ae1 In lameduck mode, shut down JetStream at start; do not leave it running during connection drain.
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-24 14:42:59 -07:00
Waldemar Quevedo
ea775a80e8 Skip TestJetStreamClusterRestartThenScaleStreamReplicas for now
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2023-09-18 12:46:53 -07:00
Derek Collison
850c89e175 When scaling a consumer down, make sure to stop the loopAndForwardProposals goroutine
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-18 12:26:25 -07:00
Waldemar Quevedo
27245891f2 Add test for scaling replica with pull consumers
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2023-09-18 12:26:05 -07:00
Derek Collison
9531611feb Add in utility to detect and delete any NRG orphans.
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-11 19:15:12 -07:00
Neil Twigg
c539fb7e9a De-flake TestJetStreamClusterConsumerMaxDeliveryNumAckPendingBug by ignoring PushBound
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-05 15:11:21 +01:00
Derek Collison
ba5d9089b1 Tweak for flapper
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-03 14:31:28 -07:00
Derek Collison
e11ddb8bfe Merge branch 'main' into dev
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-02 14:22:57 -07:00
Derek Collison
34ae2bf4cb Fix for a bug that would make normal streams use the wrong block size.
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-02 13:56:34 -07:00
Neil Twigg
487f58f16e Consumers inherit limits for max_ack_pending and inactive_threshold from stream
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-01 10:54:11 +01:00
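
Note on 487f58f16e: one way to picture "inheriting" limits is that a consumer which leaves a field unset takes the stream's value. The sketch below shows only that reading. The `Limits` type and `Inherit` function are hypothetical, treating zero as "unset" is an assumption, and the commit message does not say how the server handles a consumer value that exceeds the stream's.

```go
package main

import "time"

// Limits is a hypothetical pair of settings that may be configured on a
// stream and optionally overridden per consumer.
type Limits struct {
	MaxAckPending     int
	InactiveThreshold time.Duration
}

// Inherit returns the consumer's limits with every unset (zero) field
// filled in from the stream's limits. Explicit consumer values are kept.
func Inherit(consumer, stream Limits) Limits {
	if consumer.MaxAckPending == 0 {
		consumer.MaxAckPending = stream.MaxAckPending
	}
	if consumer.InactiveThreshold == 0 {
		consumer.InactiveThreshold = stream.InactiveThreshold
	}
	return consumer
}
```

Because `Limits` is passed by value, `Inherit` leaves the caller's copies untouched and returns the merged result.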
Derek Collison
f1bf4127c5 Merge branch 'main' into dev
2023-08-25 11:03:54 -07:00
Tomasz Pietrek
6df4403913 Fix flaky TestJetStreamClusterConsumerFollowerStoreStateAckFloorBug
Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-08-25 15:31:20 +02:00
Derek Collison
346c22788e Merge branch 'main' into dev
2023-08-24 16:20:46 -07:00
Derek Collison
48bf7ba151 When a consumer reached a max delivered condition, we did not properly synchronize the state such that on a restore or leader switch the ack pending could jump and be higher than max ack pending and block the consumer.
This propagates a delivered update and we updated the store state engine to do the right thing when the condition is reached.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-24 16:00:27 -07:00
Derek Collison
fb8525b713 Merge branch 'main' into dev
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:55:00 -07:00
Derek Collison
2fc3f45ea1 [FIXED] Durable pull consumers could get cleaned up incorrectly on leader change. (#4412)
Fix for a bug that would allow old leaders of pull based durables to
delete a consumer from an inactivity threshold timer inadvertently.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 15:35:44 -07:00
Derek Collison
43314fd439 Fix for a bug that would allow old leaders of pull based durables to delete a consumer from an inactivity threshold.
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-21 14:53:09 -07:00
Neil Twigg
7cc5838a6d Send shutdown event on LDM so that R1 assets do not get assigned to the LDM node
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-21 21:29:01 +01:00
Neil Twigg
c0636d117f Tweak consumer replica scaling, add unit test for orphaned consumer subjects
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-17 15:27:29 +01:00
Tomasz Pietrek
d105e68c96 Add consumer api action for create and update
Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-08-07 08:28:21 +02:00
Derek Collison
8079495903 Merge branch 'main' into dev
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-04 10:15:35 -07:00
Derek Collison
081140ee67 When taking over make sure to sync and reset clfs for clustered streams.
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-03 10:41:10 -07:00
Derek Collison
42752ec551 Merge branch 'main' into dev
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-01 21:46:54 -07:00
Derek Collison
5c8db89506 Make sure we do not drift on accounting.
Three issues were found and resolved.

1. Purge replays after recovery could execute full purge.
2. Callback was registered without lock, which could lead to skew.
3. Cluster reset could stop stream store and recreate it, which could lead to double accounting.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-01 18:35:20 -07:00
Derek Collison
ecf0fff411 Merge branch 'main' into dev
2023-07-17 10:41:51 -07:00
Neil Twigg
979b265e26 Tweak timing in TestJetStreamClusterDeleteConsumerWhileServerDown
Signed-off-by: Neil Twigg <neil@nats.io>
2023-07-14 16:44:15 +01:00
Derek Collison
cda7bcd389 Merge branch 'main' into dev
2023-07-12 09:06:44 -07:00
Derek Collison
9e9a9a082b When restoring a filestore with no key generator but it was encrypted, fail to restore.
Signed-off-by: Derek Collison <derek@nats.io>
2023-07-11 16:27:50 -07:00
Derek Collison
4d7cd26956 Add in support for segmented binary stream snapshots.
Streams with many interior deletes were causing issues because the interior deletes were represented as a sorted []uint64.
This approach introduces 3 subtypes of delete blocks: an AVL bitmask tree, a run-length encoding, and the legacy format above.
We also take into account large interior deletes such that on receiving a snapshot we can skip things we already know about.

Signed-off-by: Derek Collison <derek@nats.io>
2023-07-03 08:41:33 -07:00
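
Note on 4d7cd26956: to show why a run-length form helps when interior deletes are contiguous, here is a minimal sketch that collapses a list of deleted sequences into (first, count) runs and expands it again. It is not the server's snapshot format: the `Run` type and the `EncodeRuns` and `DecodeRuns` functions are hypothetical names, and the AVL bitmask tree subtype is not covered.

```go
package main

import "sort"

// Run is a hypothetical run-length record: Count consecutive deleted
// sequences starting at First.
type Run struct {
	First uint64
	Count uint64
}

// EncodeRuns collapses deleted sequence numbers into runs. The input may
// be unsorted and may contain duplicates; it is not modified.
func EncodeRuns(deleted []uint64) []Run {
	if len(deleted) == 0 {
		return nil
	}
	seqs := append([]uint64(nil), deleted...)
	sort.Slice(seqs, func(i, j int) bool { return seqs[i] < seqs[j] })

	runs := []Run{{First: seqs[0], Count: 1}}
	for _, s := range seqs[1:] {
		last := &runs[len(runs)-1]
		switch {
		case s == last.First+last.Count:
			last.Count++ // extends the current run
		case s < last.First+last.Count:
			// duplicate of a sequence already covered: skip
		default:
			runs = append(runs, Run{First: s, Count: 1})
		}
	}
	return runs
}

// DecodeRuns expands runs back into a sorted list of sequence numbers.
func DecodeRuns(runs []Run) []uint64 {
	var out []uint64
	for _, r := range runs {
		for i := uint64(0); i < r.Count; i++ {
			out = append(out, r.First+i)
		}
	}
	return out
}
```

A stream with one million consecutive interior deletes needs one `Run` here instead of one million `uint64` entries. Scattered single deletes gain nothing from this form, which is presumably why the commit keeps several subtypes.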
Derek Collison
cf393140ab Merge branch 'main' into dev
2023-06-28 17:48:53 -07:00
Derek Collison
a2b9ee9123 Shorten stream size for travis
Signed-off-by: Derek Collison <derek@nats.io>
2023-06-28 15:56:41 -07:00
Derek Collison
1bb1a3cae1 Do not health check streams that are actively being restored.
Could leave them in a bad state.

Signed-off-by: Derek Collison <derek@nats.io>
2023-06-28 15:27:45 -07:00
Ivan Kozlovic
7ff0ea449a Fixed issues with leafnode compression negotiation
When a server would send an asynchronous INFO to a remote server
it would incorrectly contain compression information that could
cause issues with one side thinking that the connection should
be compressed while the other side was not.

It also caused the authentication timer to be incorrectly set
which would cause a disconnect.

Signed-off-by: Ivan Kozlovic <ijkozlovic@gmail.com>
2023-06-09 13:20:44 -06:00
Derek Collison
a1f03513d8 Merge branch 'main' into dev
2023-06-09 09:29:13 -07:00
Derek Collison
9eeffbcf56 Fix performance issues with checkAckFloor.
Bail early if new consumer, meaning stream sequence floor is 0.
Decide which linear space to scan.
Do no work if there is nothing pending and we just need to adjust, which we do at the end.

Also realized some tests were named wrong and were not being run, or were in wrong file.

Signed-off-by: Derek Collison <derek@nats.io>
2023-06-08 18:45:03 -07:00
Derek Collison
b5c0170527 Turn off leaf compression to stabilize test for now
Signed-off-by: Derek Collison <derek@nats.io>
2023-06-08 04:37:07 -07:00
Derek Collison
fd082ee8a5 Merge branch 'main' into dev
2023-06-07 14:31:53 -07:00
Derek Collison
779978d817 Extended replay leafnode test to confirm mirror functionality
Signed-off-by: Derek Collison <derek@nats.io>
2023-06-07 14:01:43 -07:00
Derek Collison
f342f6a758 Merge branch 'main' into dev
2023-06-05 14:13:18 -07:00
Derek Collison
4ac45ff6f3 When consumers were R1 and the same name was reused, server restarts could try to clean up old ones and affect the new ones.
These changes allow consumer name reuse more effectively during server restarts.

Signed-off-by: Derek Collison <derek@nats.io>
2023-06-05 12:48:18 -07:00
Derek Collison
af318be5db Merge branch 'main' into dev
2023-06-04 13:30:15 -07:00
Maurice van Veen
132567de39 Fix PurgeEx replay with sequence & keep succeeds
2023-06-04 11:56:28 +02:00
Derek Collison
30d9dfd305 Merge branch 'main' into dev
2023-06-03 18:17:28 -07:00
Derek Collison
dee532495d Make sure to process extended purge operations correctly when being replayed on a restart.
Signed-off-by: Derek Collison <derek@nats.io>
2023-06-03 17:49:45 -07:00
Derek Collison
df901dc1aa Merge branch 'main' into dev
2023-06-02 16:45:07 -07:00
Derek Collison
1bce79750e When we were optimizing for a single cluster with a large number of leafnodes, we inadvertently broke a daisy-chained scenario where a server was a spoke and a hub with a single hub cluster.
Signed-off-by: Derek Collison <derek@nats.io>
2023-06-02 15:16:36 -07:00
Derek Collison
7760aa5107 Merge branch 'main' into dev
2023-05-16 14:01:57 -07:00
Derek Collison
734895ae47 Fix test flapper
Signed-off-by: Derek Collison <derek@nats.io>
2023-05-16 12:20:18 -07:00