Neil Twigg
979b265e26
Tweak timing in TestJetStreamClusterDeleteConsumerWhileServerDown
...
Signed-off-by: Neil Twigg <neil@nats.io >
2023-07-14 16:44:15 +01:00
Derek Collison
9e9a9a082b
When restoring a filestore that was encrypted but no key generator is present, fail to restore.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-07-11 16:27:50 -07:00
Derek Collison
a2b9ee9123
Shorten stream size for travis
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-28 15:56:41 -07:00
Derek Collison
1bb1a3cae1
Do not health check streams that are actively being restored.
...
Could leave them in a bad state.
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-28 15:27:45 -07:00
Derek Collison
9eeffbcf56
Fix performance issues with checkAckFloor.
...
Bail early if new consumer, meaning stream sequence floor is 0.
Decide which linear space to scan.
Do no work if nothing is pending and we just need to adjust, which we do at the end.
Also realized some tests were named wrong and were not being run, or were in wrong file.
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-08 18:45:03 -07:00
Derek Collison
779978d817
Extended replay leafnode test to confirm mirror functionality
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-07 14:01:43 -07:00
Derek Collison
4ac45ff6f3
When consumers were R1 and the same name was reused, server restarts could try to clean up old ones and affect the new ones.
...
These changes allow consumer name reuse more effectively during server restarts.
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-05 12:48:18 -07:00
Maurice van Veen
132567de39
Fix PurgeEx replay with sequence & keep succeeds
2023-06-04 11:56:28 +02:00
Derek Collison
dee532495d
Make sure to process extended purge operations correctly when being replayed on a restart.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-03 17:49:45 -07:00
Derek Collison
1bce79750e
When we were optimizing for a single cluster with a large number of leafnodes, we inadvertently broke a daisy-chained scenario where a server was both a spoke and a hub with a single hub cluster.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-06-02 15:16:36 -07:00
Derek Collison
734895ae47
Fix test flapper
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-16 12:20:18 -07:00
Derek Collison
b0340ce598
Make sure to wait properly until we believe we are caught up to enable direct gets.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-16 11:02:06 -07:00
Derek Collison
5e029d08d5
For older R1 streams created by previous servers we could have no cluster for the stream assignment group which would prevent scale up with newer servers.
...
This will inherit cluster if detected from placement tags or client cluster designation.
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-10 17:59:28 -07:00
Derek Collison
da8aeac91b
Fix flapper
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-03 21:00:17 -07:00
Derek Collison
21239022bd
Protect against usage drift for any unforeseen reason, and correct it if detected.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-03 17:09:06 -07:00
Derek Collison
f098c253aa
Make sure we adjust accounting reservations when deleting a stream with any issues.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-01 15:54:37 -07:00
Derek Collison
f5ac5a4da0
Fix for a bug that could leave a raft node running when stopping a stream.
...
This can happen when we reset a stream internally and the stream had a prior snapshot.
Also make sure to always release resources back to the account, even if the store is no longer present.
Signed-off-by: Derek Collison <derek@nats.io >
2023-05-01 13:22:06 -07:00
Derek Collison
546dd0c9ab
Make sure we can recover an underlying node being stopped.
...
Do not return healthy if the node is closed, and wait a bit longer for forward progress.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-29 07:42:23 -07:00
Derek Collison
d107ba3549
Under certain scenarios we have witnessed healthz() never returning healthy due to a stream or consumer being missing or stopped.
...
This will now allow the healthy call to attempt to restart those assets.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-28 17:11:08 -07:00
Derek Collison
7f06d6f5a7
When Jsz() was asked for consumer details, it would report incorrect data if the server was not the consumer leader.
...
This is due to the way state is maintained for leaders vs followers for consumers.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-26 15:03:15 -07:00
Derek Collison
c0f5b71a8f
Test that makes sure that assets that have been created under a certain cluster can be upgraded to a new cluster.
...
This is specifically when a cluster is reconfigured and the servers are restarted with a new cluster name.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-24 20:06:20 -07:00
Derek Collison
8b7c2d12aa
Run a check for ack floor drift when taking over as a leader and the ack goroutine is spun up.
...
Also check periodically. If all is normal, this will be very cheap.
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-21 11:59:35 -07:00
Derek Collison
7d3ec51d79
Fix for flapping test
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-04-03 14:46:59 -07:00
Ivan Kozlovic
a4df4f8727
Fixed some tests
...
Signed-off-by: Ivan Kozlovic <ivan@synadia.com >
2023-03-30 15:02:59 -06:00
Derek Collison
4646f4af5d
Do not allow any JetStream leaders to be placed on a lameduck server
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-29 20:15:41 -07:00
Derek Collison
02702e4620
[IMPROVEMENT] General stability and bug fixes. (#3999)
...
This PR has general improvements and fixes to filestore, raft, and the
clustering layer.
Summary
1. Additional support for preAck handling for interest based streams
when replicated acks arrive before the message itself.
2. Better handling when checking state to determine whether to remove an
interest based message.
3. Improved StepDown() and leadership transfer handling after restarts.
4. Improved voting logic for high load systems.
5. Various improvements and fixes for filestore Compact(), which is used
heavily in the raft layer when updating snapshots and the raft wal.
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-29 17:09:44 -07:00
Derek Collison
182bf6cbae
Bug fixes and general stability improvements.
...
1. If reset, ignore Applied() calls that are greater than our commit.
2. Improved StepDown() by placing at back of queue if preferred.
3. Improved handling of leadership transfer during StepDown().
4. Do not store EntryLeaderTransfer records on disk.
5. Remove un-needed processing of older terms.
6. If append entry has higher term, also inherit pterm.
7. Only inherit a candidate's term if we decide to vote for them.
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-29 12:43:46 -07:00
Neil Twigg
8d5519356e
Shut down RAFT groups when disabling JetStream
...
Signed-off-by: Neil Twigg <neil@nats.io >
2023-03-23 16:54:01 +00:00
Derek Collison
9ccd7abdf8
Test for preAcks
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-21 12:08:24 -07:00
Derek Collison
5a16f98427
Fixed an off-by-one bug that under certain circumstances could cause large consumer replica states.
...
This could lead to instability in the system.
The bug would manifest in replicated consumers when certain messages could be acked out of order and the pending list would never go to zero.
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-19 10:41:59 -07:00
Derek Collison
f0e1585490
Fix flapping test
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-17 13:14:43 -07:00
Derek Collison
5bb6f167b9
Make sure to clean up messages on a follower consumer for an interest based stream when the consumer leader sends a state snapshot.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-15 20:11:16 -07:00
Derek Collison
8dbfbbe577
Fix test
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-15 17:23:51 -07:00
Derek Collison
5a1878b015
Fix for workqueue stream scaling up and not removing acked messages.
...
Make sure when scaling up streams that are workqueue or interest policy that consumers scale as well.
Signed-off-by: Derek Collison <derek@nats.io >
2023-03-13 17:13:49 -07:00
Derek Collison
724160ebac
Fix flapping tests
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-28 14:30:23 -08:00
Derek Collison
6078706544
Fixup test for new parameters
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-27 18:56:55 -08:00
Tomasz Pietrek
02ba78454d
Fix new replicas late MaxAge expiry
...
This commit fixes an issue when scaling a stream that has MaxAge set
and some older messages stored. Until now, old messages were not properly
expired on new replicas, because a new replica's first expiry timer
was set to the MaxAge duration.
This commit adds a check for whether a received message's expiry happens before
MaxAge, meaning the message is older than the replica.
https://github.com/nats-io/nats-server/issues/3848
Signed-off-by: Tomasz Pietrek <tomasz@nats.io >
2023-02-24 00:46:02 +01:00
Neil Twigg
cfea34c80c
Install snapshot and compact when WAL grows, even when no state changes occur
2023-02-22 20:00:57 +00:00
Tomasz Pietrek
337a9f2cbd
Improve test for consumer with inactivity threshold
...
Signed-off-by: Tomasz Pietrek <tomasz@nats.io >
2023-02-19 17:57:09 +01:00
Derek Collison
06fd81d096
Fixed a bug where a named consumer under interest policy was spinning up inactive threshold timers in all replicas, not just the leader.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-19 06:08:43 -08:00
Derek Collison
6a4c61e1a3
Merge branch 'main' into bad-consumer-delete
2023-02-18 11:09:56 -08:00
Derek Collison
01fa89a0b4
Fix for deleting consumers on restarts and non-fatal update errors.
...
If there was a spurious error on restart, or possibly on an update, we could delete a consumer, which was incorrect behavior.
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-18 09:46:52 -08:00
Derek Collison
efa3bcc49d
Parallel consumer creation could drop responses (create and info) and could also run monitorConsumer twice.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-02-18 05:16:05 -08:00
Waldemar Quevedo
4452f64d73
Fix TestJetStreamParallelConsumerCreation race
...
Signed-off-by: Waldemar Quevedo <wally@nats.io >
2023-02-15 17:23:48 -08:00
Derek Collison
390fd02918
Updates to tests for updated Go client changes
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-01-31 09:47:36 -08:00
Derek Collison
f4e6481ce7
Allow report cycles between source streams if subjects truly form a cycle.
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-01-27 13:03:24 -08:00
Derek Collison
c7a75c5a6d
Merge pull request #3817 from nats-io/force-consumer-replicas
...
[FIXED] Force consumer replicas to match for interest policy streams
2023-01-26 09:39:15 -08:00
Derek Collison
3d78459ad1
Fixup for bad merge
...
Signed-off-by: Derek Collison <derek@nats.io >
2023-01-26 09:09:30 -08:00
Neil Twigg
83932b4be6
Don't mark a clustered stream as unhealthy if making forward progress, add TestJetStreamClusterCurrentVsHealth
2023-01-26 16:57:34 +00:00
Derek Collison
d0a7a8169a
Merge branch 'main' into force-consumer-replicas
2023-01-26 08:35:49 -08:00