Commit Graph

4525 Commits

Author SHA1 Message Date
Derek Collison
bee149b458 Only need server's rlock here.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-28 11:17:29 -08:00
Derek Collison
aad8aa6f21 Do not need lock to grab js here
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
fa8afba68f Only warn on write errors if not closed in case they linger under pressure and blocking on dios
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
576d31748f Sometimes do force meta snapshot
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
6078706544 Fixup test for new parameters
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
9721309601 Do not allow meta snapshot processing during recovery to override.
Make sure to process all stream updates during recovery through the ru structure.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
2642a8c03d Optimize locking for when under heavy loads.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
13167f46b9 Optimize some locking for when under heavy loads.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
3cebd26ef9 Optimize for high IO workloads. When we know optional metadata will always be correct on restart do not require inline IO all the time.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 18:56:55 -08:00
Derek Collison
2711460b7b Prevent benign spin between competing leaders with same index but differen term.
Remove lock from route processing for updating peers progress, altready handled in trackPeer.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 11:21:33 -08:00
Derek Collison
43916290df Make minimum snapshot time for all assets 10s.
Do not lock on clustered test for JetStream, not needed.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-27 11:20:37 -08:00
Derek Collison
4fa0ea32c3 [FIXED] If a truncate for a raft WAL failed we could spin.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-25 19:07:27 -08:00
Derek Collison
daacbf5580 Added optimized store NumPending() call.
Optimized and fixed a bug in filestore filteredPending().
Optimized memstore FilteredState().

Added comprehensive tests for NumPending() and FilteredState().

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-25 17:26:26 -08:00
Derek Collison
24c2f3b452 Improved performance of subjects details for stream info.
This version avoids all disk IO in the filestore version.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-24 17:22:18 -08:00
Derek Collison
87a1846822 Merge pull request #3907 from nats-io/raft-fix
[FIXED] Snapshots would not compact through applied.
2023-02-23 22:44:45 -08:00
Derek Collison
ea2bfad8ea Fixed bug where snapshot would not compact through applied. This mean a subsequent request for exactly applied would return that entry only not the full state snapshot.
Fixed bug where we would not snapshot when we should.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-23 22:19:37 -08:00
Tomasz Pietrek
02ba78454d Fix new replicas late MaxAge expiry
This commit fixes the issue when scaling Stream with MaxAge
and some older messages stored. Until now, old messages were not properly
expired on new replicas, because new replicas first expiry timer
was set to MaxAge duration.
This commit adds a check if received messages expiry happens before
MaxAge, meaning they're messages older than the replica.
https://github.com/nats-io/nats-server/issues/3848

Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-02-24 00:46:02 +01:00
Derek Collison
45859e6476 Make sure preferred peer for stepdown is healthy.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-23 13:06:13 -08:00
Derek Collison
d347cb116a When becoming leader optionally send current snapshot to followers if caught up.
This can help sync on restarts and improve ghost ephemerals. Also added more code to suppress respnses and API audits when we know we are recovering.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-23 10:30:36 -08:00
Derek Collison
f2c5e7ec02 Bump to 2.9.15-RC.2
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-22 19:54:13 -08:00
Derek Collison
2972c11be6 Improve consumer create performance.
In cases where we had a large subject space, a filestore with many msg blocks, and a filtered consumer with a wildcard filtered subject, creating a consumer could take more memory and time then we wanted.
This improvement works when the consumer is DeliverAll and we used the upper layer in memory psim structure to scan but only in memory and avoid a file read for each msg block.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-22 19:42:02 -08:00
Derek Collison
f16a7d8559 Skip test for now
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-22 15:49:48 -08:00
Derek Collison
d03d8e9d93 When having a max msgs per subject (e.g. KV) under heavy concurrent usage could skew the accounting for the underlying filestore.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-22 12:50:43 -08:00
Neil Twigg
cfea34c80c Install snapshot and compact when WAL grows, even when no state changes occur 2023-02-22 20:00:57 +00:00
Derek Collison
b19077c2e9 Bump to 2.9.15-RC.1
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-21 08:35:30 -08:00
Neil Twigg
68961ffedd Refactor ipQueue to use generics, reduce allocations 2023-02-21 14:50:09 +00:00
Derek Collison
18b5aca499 Merge pull request #3892 from nats-io/consumer-fixes
[FIXED] Consumer fixes and improvements on state management.
2023-02-20 19:17:39 -07:00
Derek Collison
3c64d07691 Warn of consumer state update failures.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-20 17:28:11 -08:00
Derek Collison
e028b7230a Need to compact wal on snapshot to pindex+1
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-20 14:37:37 -08:00
Derek Collison
d2179e0939 Make sure to also cleanup pending if below our stream ack floor
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-20 11:56:37 -08:00
Derek Collison
b6149c51f0 Make sure to clean up redelivered state on purge.
Make sure to update ack floors on messages being expired out from underneath of us.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-20 11:16:12 -08:00
Maurice van Veen
05695d304c Fixed a bug where partition was used with multiple wildcard token position 2023-02-20 10:27:29 +01:00
Derek Collison
6c9a9fb45e Fixed bug that would lose ack pending state during partial stream purge.
General code cleanup to be more correct.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-19 14:21:53 -08:00
Derek Collison
5c6b3b620a Merge pull request #3885 from nats-io/jarema/improve-interest-test
Improve test for consumer with inactivity threshold
2023-02-19 10:36:36 -07:00
Tomasz Pietrek
337a9f2cbd Improve test for consumer with inactivity threshold
Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-02-19 17:57:09 +01:00
Waldemar Quevedo
beb179ec15 Check if connection name was already set when storing it
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-02-19 07:58:56 -08:00
Derek Collison
06fd81d096 Fixed a bug where a named consumer under interest policy was spinning up inactive threshold timers in all replicas not just the leader.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-19 06:08:43 -08:00
Derek Collison
e270e9538f Do not warn if consumer replicas condigured to 0
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-18 11:50:26 -08:00
Derek Collison
6a62ac4560 Fix for merge conflict
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-18 11:12:15 -08:00
Derek Collison
6a4c61e1a3 Merge branch 'main' into bad-consumer-delete 2023-02-18 11:09:56 -08:00
Derek Collison
01fa89a0b4 Fix for deleting consumers on restarts and non-fatal update errors.
If there was a spurious error on restart, or possibly on an update, we could delete a consumer which was the incorrect behavior.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-18 09:46:52 -08:00
Derek Collison
efa3bcc49d Parallel consumer creation could drop responses (create and info) and could also run monitorConsumer twice.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-18 05:16:05 -08:00
Derek Collison
2d794d09e1 Fix to flapping test to make sure we do not quickly blow away all consumer state.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-17 14:23:34 -08:00
Derek Collison
11b0f214d0 Do not re-calculate NumPending on consumer info calls.
We noticed this was being called alot in user environments.
When the consumer was filtered with a wilcard and the stream had a high cardinality of subjects and was falling behind this could take a substantial amount of time.

Signed-off-by: Derek Collison <derek@nats.io>
2023-02-16 16:30:14 -08:00
Derek Collison
0cb01f9e7a Make sure we update storage accounting on extended version purge for filestore.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-16 13:18:40 +04:00
Derek Collison
b3b9e888f3 Merge pull request #3873 from nats-io/diskio-test
[FIXED] Adjusted test to correspond to new limit of 1024.
2023-02-15 20:45:15 -07:00
Derek Collison
32b5ec16dd Fixed test to correspond to new limit of 1024.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-16 07:16:19 +04:00
Waldemar Quevedo
4452f64d73 Fix TestJetStreamParallelConsumerCreation race
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-02-15 17:23:48 -08:00
Derek Collison
345496f331 Merge pull request #3867 from nats-io/improvements
Improvements to Filestore
2023-02-14 05:35:40 -07:00
Derek Collison
3bc0af70d0 Only update per subject information if we know we have an update.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-13 20:12:35 +02:00